-
Notifications
You must be signed in to change notification settings - Fork 733
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Update to use the new LoadStoreHandler API on Power #11305
Update to use the new LoadStoreHandler API on Power #11305
Conversation
Jenkins test sanity.functional,extended.system aix,aixxl,plinux,plinuxxl jdk8 |
Jenkins test sanity.functional,extended.system aix,aixxl,plinux,plinuxxl jdk8 depends eclipse-omr/omr#5652 |
Jenkins test sanity.functional,extended.system aix,aixxl,plinux,plinuxxl jdk8 depends eclipse-omr/omr#5652 |
e09de8f
to
bc90ee3
Compare
Jenkins test sanity.functional,extended.system aix,aixxl,plinux,plinuxxl jdk8 depends eclipse-omr/omr#5652 |
1 similar comment
Jenkins test sanity.functional,extended.system aix,aixxl,plinux,plinuxxl jdk8 depends eclipse-omr/omr#5652 |
I would like to review before this PR is merged. |
Jenkins test sanity.functional,extended.system aix,aixxl,plinux,plinuxxl jdk8 depends eclipse-omr/omr#5652 |
Jenkins test special.system aix,aixxl,plinux,plinuxxl jdk8 depends eclipse-omr/omr#5652 |
1 similar comment
Jenkins test special.system aix,aixxl,plinux,plinuxxl jdk8 depends eclipse-omr/omr#5652 |
Jenkins test sanity.functional,extended.functional,sanity.system,extended.system,special.system,sanity.openjdk aix,aixxl,plinux,plinuxxl jdk8 depends eclipse-omr/omr#5652 |
79be1d9
to
24dc2ba
Compare
Jenkins test sanity.functional,extended.functional,sanity.system,extended.system,special.system,sanity.openjdk aix,aixxl,plinux,plinuxxl jdk8 depends eclipse-omr/omr#5652 |
Jenkins test sanity.functional,extended.functional,sanity.system,extended.system,special.system,sanity.openjdk plinuxxl jdk8 depends eclipse-omr/omr#5652 |
24dc2ba
to
825ae51
Compare
|
||
TR::addDependency(conditions, temp1Reg, TR::RealRegister::gr11, TR_GPR, cg); | ||
TR::addDependency(conditions, dstReg, TR::RealRegister::NoReg, TR_GPR, cg); | ||
conditions->getPreConditions()->getRegisterDependency(conditions->getAddCursorForPre() - 1)->setExcludeGPR0(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this will cause some register shuffling since before dependencies were shared between the array check store and write barrier helpers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think removing these explicit dependencies should cause any additional register shuffles. The code generated for the write barrier comes after the label with these dependencies, meaning that the more specific dependencies from the write barrier would be encountered by the local register allocator before these ones.
Reviewed "Simplify ArrayStoreCHK evaluator " commit pretty thoroughly. I think avoiding the duplication is worth minor sacrifice in the performance. However, since it might affect the performance it would be good to have it in a separate build(and therefore PR) for easier tracking. |
@zl-wang I think I need your help to review. |
We need to work towards completing it for 0.25 (early February code split) in order to have a permanent fix for #11511 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
these changes just narrowed the occasions for explicit prefetch to kick in. It is not true to the original intention.
@@ -403,7 +403,7 @@ J9::Power::CodeGenerator::insertPrefetchIfNecessary(TR::Node *node, TR::Register | |||
static bool disableStringObjPrefetch = (feGetEnv("TR_DisableStringObjPrefetch") != NULL); | |||
bool optDisabled = false; | |||
|
|||
if (node->getOpCodeValue() == TR::aloadi || | |||
if ((node->getOpCodeValue() == TR::aloadi && !comp->target().is64Bit()) || | |||
(comp->target().is64Bit() && |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks like conditions being misplaced originally. This current fix certainly is not in line with the intention of prefetch optimization. Prefetch surely is beneficial performance-wise to 64bit non-compressed.
The conditions should read something like: if (is-objectReference-load && hotness-testing && CPU-testing)
Applies to the next change too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While I agree that prefetching certainly could be applied in these circumstances, extending the prefetching optimization to handle this case correctly is out-of-scope for this PR. As the commit message clarifies, the previous implementation was not performing prefetching for such nodes either, but the way this was being achieved was slightly different. This change does not make prefetching not occur where it previously would, but simply makes insertPrefetchIfNecessary
safe to call on aloadi
nodes under 64-bit, as it would otherwise segfault.
Previously, we were simply failing to call insertPrefetchIfNecessary
at all under the problematic conditions, thus sidestepping the problem. In commoning up code between the load evaluators into LoadStoreHandler
, all of these evaluators need to behave uniformly and thus now call insertPrefetchIfNecessary
unconditionally, with it being up to that method alone to make the decision as to whether to perform prefetching. This also avoids leaking implementation details of OpenJ9's prefetching into OMR.
If you'd like, I can open an issue for extending this support in the future, though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
certainly need a follow-up item
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Opened #11803
825ae51
to
688a9f3
Compare
Rebased this PR (and the corresponding OMR PR) to remove the commit that was split into #11530. |
Remember to test SPECjbb2005 (yes, it is 2005) with -Xaggressive option (where, at least, i knew explicit prefetch kicking in) |
@zl-wang As an external contributor, I do not have access to test SPECjbb2005. I am, however, under the impression that prefetching has been exercised on our supported platforms during testing of this PR, as that fix had to be added specifically to correct a segfault that came up during CI testing (though I don't remember specifically which test was failing). |
loadOpCode = TR::InstOpCode::lwz; | ||
} | ||
TR_ASSERT_FATAL_WITH_NODE(nextTopNode, !tempMR.getMemoryReference()->getIndexRegister(), "Simple read monitors do not currently support indexed loads"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@zl-wang is there any chance this assert can get triggered?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
since TR::LoadStoreHandler::generateSimpleLoadMemoryReference doesn't guarantee immediate-form addressing, it looks like potentially this assert can trigger. On the other hand, the existing code didn't handle the indexRegister anyway (if it were present), so that it could be wrong already but not caught with assert. I would suggest removing this assert and staying the previous assumption for the time being, and open a follow-up item to handle the indexRegsiter if it becomes present. as it is, it likely works even if indexRegister is present. But this newly added assert will be a flare-up regression in the field (as the ConstantDataSnippet assert was). @aviansie-ben
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
since TR::LoadStoreHandler::generateSimpleLoadMemoryReference doesn't guarantee immediate-form addressing, it looks like potentially this assert can trigger
In practice, this assert should never fire since the code above already checks that nextTopNode->getFirstChild()
has been previously evaluated and generateSimpleLoadMemoryReference
will never generate an index-form memory reference under these conditions unless one is explicitly requested. This assert is being added to document this assumption and to guard against any future changes that might violate it.
as it is, it likely works even if indexRegister is present
The consequences of reaching this code with an index-form memory reference could be severe and result in very difficult-to-diagnose issues and possible security vulnerabilities. Firstly, the ICF register dependencies would not include the index register, so we'd potentially get spills in ICF. Secondly and much more worryingly, PPCReadMonitorSnippet
(which is used as a fallback if the lock is already held when this code is entered) will always generate an immediate-form load and so would load at an offset of 0 from the base register, which will predictably load the J9Class
of the object. If an attacker could cause a race condition to occur, they could reliably force a non-zero result from such a load, which could be a big problem if the value being loaded is a security-critical boolean.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so, we agree the assert has no much value in catching a problematic case, but more for future-proof which i suggested opening a new PR to be addressed instead. for this merge in particular, it is just better off to get rid of this assert.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, we clearly do not agree. The whole point of an assert is to assert that the stated condition should not happen and to make sure that the compiler fails in a predictable manner if it ever does rather than simply generating garbage code that could wreak havoc on the running application. There is no need to open a new issue or PR here because this is a case that shouldn't happen and thus does not need to be handled.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the problem is: it is predictably fatal if it triggers. we are very assert-wary ... high cost of service. as a matter of fact as of today, there are still outstanding assert PMRs for newly added assert to be addressed (iFix delivery) in some branches.
it is not a regression without this assert (existing bug if indexed form), and it is a new regression with the assert. better way moving forward: either you make it work for all cases, or we will fix the existing bug in the near future. we just don't want to deal with the potentially new regression.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Frankly, I find your reasoning here to be misguided. IMO it is much better to crash in a predictable manner than to potentially exhibit wrong behaviour much later on and it being "not a regression" to exhibit wrong behaviour is a poor excuse. However, in the interests of moving this PR forward, I have removed the assert in this particular case. Do note that I don't intend on letting excuses of this nature fly in OMR, though, and I will continue to ensure that asserts are added as appropriate there.
loadOpCode = TR::InstOpCode::lwz; | ||
} | ||
TR_ASSERT_FATAL_WITH_NODE(nextTopNode, !tempMR.getMemoryReference()->getIndexRegister(), "Simple read monitors do not currently support indexed loads"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
since TR::LoadStoreHandler::generateSimpleLoadMemoryReference doesn't guarantee immediate-form addressing, it looks like potentially this assert can trigger. On the other hand, the existing code didn't handle the indexRegister anyway (if it were present), so that it could be wrong already but not caught with assert. I would suggest removing this assert and staying the previous assumption for the time being, and open a follow-up item to handle the indexRegsiter if it becomes present. as it is, it likely works even if indexRegister is present. But this newly added assert will be a flare-up regression in the field (as the ConstantDataSnippet assert was). @aviansie-ben
Previously, the code in the insertPrefetchIfNecessary method was expecting that it would only see aloadi nodes on 32-bit. Because of this, the code would not operate correctly and would cause segfaults during compilation. This was previously being handled by simply not calling insertPrefetchIfNecessary under these conditions, however keeping this behaviour requires that OMR know about implementation details of OpenJ9's prefetching. While this code could theoretically be made to work under these conditions, the code as written simply cannot handle this case. Instead, a check has been added to prevent insertPrefetchIfNecessary from operating on aloadi nodes on 64-bit to replicate the old behaviour. Signed-off-by: Ben Thomas <[email protected]>
Signed-off-by: Ben Thomas <[email protected]>
Previously, the code for performing write barriers was using the old MemoryReference-based API for performing its stores. This API is being deprecated in OMR, so all code dealing with write barriers has been updated to use the new LoadStoreHandler API. Signed-off-by: Ben Thomas <[email protected]>
Previously, the code for performing read barriers was using the old MemoryReference-based API for performing its loads. This API is being deprecated in OMR, so all code dealing with read barriers has been updated to use the new LoadStoreHandler API. Unfortunately, this code cannot be updated in its current form to use LoadStoreHandler as intended since it unavoidably performs the load during ICF. Thus, it currently uses generateComputeAddressSequence to compute the address but does not use generateLoadAddressSequence to perform the load. This is safe since OpenJ9 is the most derived project that could possibly overload LoadStoreHandler and emitting unnecessary sync instructions (while bad for performance) is not incorrect. This code should be revisited later to make it more idiomatic, but that is beyond the scope of this particular change. Signed-off-by: Ben Thomas <[email protected]>
Previously, the code for emitting simple read monitors was using the old MemoryReference-based API for performing the load. This API is being deprecated in OMR, so all code dealing with read barriers has been updated to use the new LoadStoreHandler API. Signed-off-by: Ben Thomas <[email protected]>
Signed-off-by: Ben Thomas <[email protected]>
Signed-off-by: Ben Thomas <[email protected]>
dfaf3b8
to
f8d77f9
Compare
Jenkins test sanity all jdk8,jdk11 depends eclipse-omr/omr#5652 |
The refactor of the POWER awrtbariEvaluator as a part of the POWER LoadStore refactoring work (eclipse-openj9#11305) switched the usage of sourceRegister and storeRegister which holds the full and compressed refs respectively when using compressedRefs. Signed-off-by: Abdulrahman Alattas <[email protected]>
The refactor of the POWER awrtbariEvaluator as a part of the POWER LoadStore refactoring work (eclipse-openj9#11305) switched the usage of sourceRegister and storeRegister which holds the full and compressed refs respectively when using compressedRefs. Signed-off-by: Abdulrahman Alattas <[email protected]>
Recently, OMR has decided to simplify the way that node-based loads and stores are generated on Power due to a large number of issues regarding missing barriers for volatile loads and a great deal of OpenJ9-specific code surrounding volatile and unresolved that has leaked into OMR (eclipse-omr/omr#5630). The current plan is for the old API for directly generating memory references from nodes to be deprecated after the new API is implemented, so OpenJ9 must be updated to use this new API.
This PR should be merged simultaneously with the corresponding OMR PR to implement the new API (eclipse-omr/omr#5652).