Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update to use the new LoadStoreHandler API on Power #11305

Merged
merged 7 commits into from
Feb 17, 2021

Conversation

aviansie-ben
Copy link
Contributor

@aviansie-ben aviansie-ben commented Nov 30, 2020

Recently, OMR has decided to simplify the way that node-based loads and stores are generated on Power due to a large number of issues regarding missing barriers for volatile loads and a great deal of OpenJ9-specific code surrounding volatile and unresolved that has leaked into OMR (eclipse-omr/omr#5630). The current plan is for the old API for directly generating memory references from nodes to be deprecated after the new API is implemented, so OpenJ9 must be updated to use this new API.

This PR should be merged simultaneously with the corresponding OMR PR to implement the new API (eclipse-omr/omr#5652).

@aviansie-ben aviansie-ben added arch:power comp:jit depends:omr Pull request is dependent on a corresponding change in OMR labels Nov 30, 2020
@fjeremic
Copy link
Contributor

Jenkins test sanity.functional,extended.system aix,aixxl,plinux,plinuxxl jdk8

@fjeremic
Copy link
Contributor

fjeremic commented Dec 1, 2020

Jenkins test sanity.functional,extended.system aix,aixxl,plinux,plinuxxl jdk8 depends eclipse-omr/omr#5652

@fjeremic fjeremic self-assigned this Dec 1, 2020
@fjeremic
Copy link
Contributor

fjeremic commented Dec 1, 2020

Jenkins test sanity.functional,extended.system aix,aixxl,plinux,plinuxxl jdk8 depends eclipse-omr/omr#5652

@aviansie-ben aviansie-ben force-pushed the power-loadstorehandler branch from e09de8f to bc90ee3 Compare December 2, 2020 16:26
@aviansie-ben
Copy link
Contributor Author

Jenkins test sanity.functional,extended.system aix,aixxl,plinux,plinuxxl jdk8 depends eclipse-omr/omr#5652

1 similar comment
@fjeremic
Copy link
Contributor

fjeremic commented Dec 2, 2020

Jenkins test sanity.functional,extended.system aix,aixxl,plinux,plinuxxl jdk8 depends eclipse-omr/omr#5652

@gita-omr
Copy link
Contributor

gita-omr commented Dec 3, 2020

I would like to review before this PR is merged.

@gita-omr
Copy link
Contributor

gita-omr commented Dec 3, 2020

@0xdaryl @fjeremic could you please add me as a reviewer?

@0xdaryl 0xdaryl requested a review from gita-omr December 3, 2020 14:28
@fjeremic fjeremic added depends:omr:breaking and removed depends:omr Pull request is dependent on a corresponding change in OMR labels Dec 3, 2020
@aviansie-ben
Copy link
Contributor Author

Jenkins test sanity.functional,extended.system aix,aixxl,plinux,plinuxxl jdk8 depends eclipse-omr/omr#5652

@fjeremic
Copy link
Contributor

fjeremic commented Dec 4, 2020

Jenkins test special.system aix,aixxl,plinux,plinuxxl jdk8 depends eclipse-omr/omr#5652

1 similar comment
@aviansie-ben
Copy link
Contributor Author

Jenkins test special.system aix,aixxl,plinux,plinuxxl jdk8 depends eclipse-omr/omr#5652

@fjeremic
Copy link
Contributor

fjeremic commented Dec 8, 2020

Jenkins test sanity.functional,extended.functional,sanity.system,extended.system,special.system,sanity.openjdk aix,aixxl,plinux,plinuxxl jdk8 depends eclipse-omr/omr#5652

@aviansie-ben aviansie-ben force-pushed the power-loadstorehandler branch from 79be1d9 to 24dc2ba Compare December 10, 2020 21:06
@aviansie-ben
Copy link
Contributor Author

Jenkins test sanity.functional,extended.functional,sanity.system,extended.system,special.system,sanity.openjdk aix,aixxl,plinux,plinuxxl jdk8 depends eclipse-omr/omr#5652

@aviansie-ben
Copy link
Contributor Author

Jenkins test sanity.functional,extended.functional,sanity.system,extended.system,special.system,sanity.openjdk plinuxxl jdk8 depends eclipse-omr/omr#5652

@aviansie-ben
Copy link
Contributor Author

The only remaining CI failures seem to be duplicates of #11104 and #11192, so this PR should be ready to go now.

@aviansie-ben aviansie-ben changed the title WIP: Update to use the new LoadStoreHandler API on Power Update to use the new LoadStoreHandler API on Power Dec 11, 2020
@aviansie-ben aviansie-ben marked this pull request as ready for review December 11, 2020 17:52
@aviansie-ben aviansie-ben force-pushed the power-loadstorehandler branch from 24dc2ba to 825ae51 Compare December 17, 2020 20:05

TR::addDependency(conditions, temp1Reg, TR::RealRegister::gr11, TR_GPR, cg);
TR::addDependency(conditions, dstReg, TR::RealRegister::NoReg, TR_GPR, cg);
conditions->getPreConditions()->getRegisterDependency(conditions->getAddCursorForPre() - 1)->setExcludeGPR0();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this will cause some register shuffling since before dependencies were shared between the array check store and write barrier helpers.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think removing these explicit dependencies should cause any additional register shuffles. The code generated for the write barrier comes after the label with these dependencies, meaning that the more specific dependencies from the write barrier would be encountered by the local register allocator before these ones.

@gita-omr
Copy link
Contributor

Reviewed "Simplify ArrayStoreCHK evaluator " commit pretty thoroughly. I think avoiding the duplication is worth minor sacrifice in the performance. However, since it might affect the performance it would be good to have it in a separate build(and therefore PR) for easier tracking.

@aviansie-ben
Copy link
Contributor Author

@gita-omr I've opened #11530 with the ArrayStoreCHK simplification alone. For the time being, that change still exists in this PR as removing it would cause pretty bad merge conflicts. I'll rebase this change to remove that commit once that PR is merged.

@gita-omr
Copy link
Contributor

gita-omr commented Jan 5, 2021

@zl-wang I think I need your help to review.

@gita-omr gita-omr added this to the Release 0.25 (Java 16) milestone Jan 5, 2021
@gita-omr
Copy link
Contributor

gita-omr commented Jan 5, 2021

We need to work towards completing it for 0.25 (early February code split) in order to have a permanent fix for #11511

@gita-omr gita-omr requested a review from zl-wang January 5, 2021 01:08
Copy link
Contributor

@zl-wang zl-wang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these changes just narrowed the occasions for explicit prefetch to kick in. It is not true to the original intention.

@@ -403,7 +403,7 @@ J9::Power::CodeGenerator::insertPrefetchIfNecessary(TR::Node *node, TR::Register
static bool disableStringObjPrefetch = (feGetEnv("TR_DisableStringObjPrefetch") != NULL);
bool optDisabled = false;

if (node->getOpCodeValue() == TR::aloadi ||
if ((node->getOpCodeValue() == TR::aloadi && !comp->target().is64Bit()) ||
(comp->target().is64Bit() &&
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like conditions being misplaced originally. This current fix certainly is not in line with the intention of prefetch optimization. Prefetch surely is beneficial performance-wise to 64bit non-compressed.

The conditions should read something like: if (is-objectReference-load && hotness-testing && CPU-testing)

Applies to the next change too.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While I agree that prefetching certainly could be applied in these circumstances, extending the prefetching optimization to handle this case correctly is out-of-scope for this PR. As the commit message clarifies, the previous implementation was not performing prefetching for such nodes either, but the way this was being achieved was slightly different. This change does not make prefetching not occur where it previously would, but simply makes insertPrefetchIfNecessary safe to call on aloadi nodes under 64-bit, as it would otherwise segfault.

Previously, we were simply failing to call insertPrefetchIfNecessary at all under the problematic conditions, thus sidestepping the problem. In commoning up code between the load evaluators into LoadStoreHandler, all of these evaluators need to behave uniformly and thus now call insertPrefetchIfNecessary unconditionally, with it being up to that method alone to make the decision as to whether to perform prefetching. This also avoids leaking implementation details of OpenJ9's prefetching into OMR.

If you'd like, I can open an issue for extending this support in the future, though.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

certainly need a follow-up item

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Opened #11803

@aviansie-ben aviansie-ben force-pushed the power-loadstorehandler branch from 825ae51 to 688a9f3 Compare January 5, 2021 19:21
@aviansie-ben
Copy link
Contributor Author

Rebased this PR (and the corresponding OMR PR) to remove the commit that was split into #11530.

@zl-wang
Copy link
Contributor

zl-wang commented Jan 5, 2021

Remember to test SPECjbb2005 (yes, it is 2005) with -Xaggressive option (where, at least, i knew explicit prefetch kicking in)

@aviansie-ben
Copy link
Contributor Author

@zl-wang As an external contributor, I do not have access to test SPECjbb2005. I am, however, under the impression that prefetching has been exercised on our supported platforms during testing of this PR, as that fix had to be added specifically to correct a segfault that came up during CI testing (though I don't remember specifically which test was failing).

loadOpCode = TR::InstOpCode::lwz;
}
TR_ASSERT_FATAL_WITH_NODE(nextTopNode, !tempMR.getMemoryReference()->getIndexRegister(), "Simple read monitors do not currently support indexed loads");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@zl-wang is there any chance this assert can get triggered?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since TR::LoadStoreHandler::generateSimpleLoadMemoryReference doesn't guarantee immediate-form addressing, it looks like potentially this assert can trigger. On the other hand, the existing code didn't handle the indexRegister anyway (if it were present), so that it could be wrong already but not caught with assert. I would suggest removing this assert and staying the previous assumption for the time being, and open a follow-up item to handle the indexRegsiter if it becomes present. as it is, it likely works even if indexRegister is present. But this newly added assert will be a flare-up regression in the field (as the ConstantDataSnippet assert was). @aviansie-ben

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since TR::LoadStoreHandler::generateSimpleLoadMemoryReference doesn't guarantee immediate-form addressing, it looks like potentially this assert can trigger

In practice, this assert should never fire since the code above already checks that nextTopNode->getFirstChild() has been previously evaluated and generateSimpleLoadMemoryReference will never generate an index-form memory reference under these conditions unless one is explicitly requested. This assert is being added to document this assumption and to guard against any future changes that might violate it.

as it is, it likely works even if indexRegister is present

The consequences of reaching this code with an index-form memory reference could be severe and result in very difficult-to-diagnose issues and possible security vulnerabilities. Firstly, the ICF register dependencies would not include the index register, so we'd potentially get spills in ICF. Secondly and much more worryingly, PPCReadMonitorSnippet (which is used as a fallback if the lock is already held when this code is entered) will always generate an immediate-form load and so would load at an offset of 0 from the base register, which will predictably load the J9Class of the object. If an attacker could cause a race condition to occur, they could reliably force a non-zero result from such a load, which could be a big problem if the value being loaded is a security-critical boolean.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so, we agree the assert has no much value in catching a problematic case, but more for future-proof which i suggested opening a new PR to be addressed instead. for this merge in particular, it is just better off to get rid of this assert.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, we clearly do not agree. The whole point of an assert is to assert that the stated condition should not happen and to make sure that the compiler fails in a predictable manner if it ever does rather than simply generating garbage code that could wreak havoc on the running application. There is no need to open a new issue or PR here because this is a case that shouldn't happen and thus does not need to be handled.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the problem is: it is predictably fatal if it triggers. we are very assert-wary ... high cost of service. as a matter of fact as of today, there are still outstanding assert PMRs for newly added assert to be addressed (iFix delivery) in some branches.

it is not a regression without this assert (existing bug if indexed form), and it is a new regression with the assert. better way moving forward: either you make it work for all cases, or we will fix the existing bug in the near future. we just don't want to deal with the potentially new regression.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Frankly, I find your reasoning here to be misguided. IMO it is much better to crash in a predictable manner than to potentially exhibit wrong behaviour much later on and it being "not a regression" to exhibit wrong behaviour is a poor excuse. However, in the interests of moving this PR forward, I have removed the assert in this particular case. Do note that I don't intend on letting excuses of this nature fly in OMR, though, and I will continue to ensure that asserts are added as appropriate there.

loadOpCode = TR::InstOpCode::lwz;
}
TR_ASSERT_FATAL_WITH_NODE(nextTopNode, !tempMR.getMemoryReference()->getIndexRegister(), "Simple read monitors do not currently support indexed loads");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since TR::LoadStoreHandler::generateSimpleLoadMemoryReference doesn't guarantee immediate-form addressing, it looks like potentially this assert can trigger. On the other hand, the existing code didn't handle the indexRegister anyway (if it were present), so that it could be wrong already but not caught with assert. I would suggest removing this assert and staying the previous assumption for the time being, and open a follow-up item to handle the indexRegsiter if it becomes present. as it is, it likely works even if indexRegister is present. But this newly added assert will be a flare-up regression in the field (as the ConstantDataSnippet assert was). @aviansie-ben

Previously, the code in the insertPrefetchIfNecessary method was
expecting that it would only see aloadi nodes on 32-bit. Because of
this, the code would not operate correctly and would cause segfaults
during compilation. This was previously being handled by simply not
calling insertPrefetchIfNecessary under these conditions, however
keeping this behaviour requires that OMR know about implementation
details of OpenJ9's prefetching.

While this code could theoretically be made to work under these
conditions, the code as written simply cannot handle this case. Instead,
a check has been added to prevent insertPrefetchIfNecessary from
operating on aloadi nodes on 64-bit to replicate the old behaviour.

Signed-off-by: Ben Thomas <[email protected]>
Previously, the code for performing write barriers was using the old
MemoryReference-based API for performing its stores. This API is being
deprecated in OMR, so all code dealing with write barriers has been
updated to use the new LoadStoreHandler API.

Signed-off-by: Ben Thomas <[email protected]>
Previously, the code for performing read barriers was using the old
MemoryReference-based API for performing its loads. This API is being
deprecated in OMR, so all code dealing with read barriers has been
updated to use the new LoadStoreHandler API.

Unfortunately, this code cannot be updated in its current form to use
LoadStoreHandler as intended since it unavoidably performs the load
during ICF. Thus, it currently uses generateComputeAddressSequence to
compute the address but does not use generateLoadAddressSequence to
perform the load. This is safe since OpenJ9 is the most derived project
that could possibly overload LoadStoreHandler and emitting unnecessary
sync instructions (while bad for performance) is not incorrect.

This code should be revisited later to make it more idiomatic, but that
is beyond the scope of this particular change.

Signed-off-by: Ben Thomas <[email protected]>
Previously, the code for emitting simple read monitors was using the old
MemoryReference-based API for performing the load. This API is being
deprecated in OMR, so all code dealing with read barriers has been
updated to use the new LoadStoreHandler API.

Signed-off-by: Ben Thomas <[email protected]>
@aviansie-ben aviansie-ben force-pushed the power-loadstorehandler branch from dfaf3b8 to f8d77f9 Compare February 11, 2021 05:22
@fjeremic
Copy link
Contributor

Jenkins test sanity all jdk8,jdk11 depends eclipse-omr/omr#5652

@fjeremic fjeremic merged commit 5e2594c into eclipse-openj9:master Feb 17, 2021
rmnattas added a commit to rmnattas/openj9 that referenced this pull request Mar 8, 2021
The refactor of the POWER awrtbariEvaluator as a part of the POWER
LoadStore refactoring work (eclipse-openj9#11305) switched the usage of
sourceRegister and storeRegister which holds the full and compressed
refs respectively when using compressedRefs.

Signed-off-by: Abdulrahman Alattas <[email protected]>
rmnattas added a commit to rmnattas/openj9 that referenced this pull request Mar 10, 2021
The refactor of the POWER awrtbariEvaluator as a part of the POWER
LoadStore refactoring work (eclipse-openj9#11305) switched the usage of
sourceRegister and storeRegister which holds the full and compressed
refs respectively when using compressedRefs.

Signed-off-by: Abdulrahman Alattas <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants