Support opaque and acquire/release memory semantics #7517

Spencer-Comin · 2024-10-31T23:08:57Z

Expand the possible memory ordering semantics of symbols from volatile or non-volatile to volatile, acquire/release, opaque, or transparent. The memory ordering semantics are defined as follows:

Transparent

Accesses are only guaranteed to be bitwise atomic for addresses and for data of 32 bits or smaller.
This is the same as non-volatile semantics prior to this change.

Opaque

Accesses to opaque symbols are bitwise atomic.
The execution order of all opaque accesses to any given address in a single thread is the same as the program order of accesses to that address.

Acquire/Release

Loads of acquire/release symbols are acquire loads; i.e., loads and stores after a given acquire load will not be reordered to before that load. This matches the semantics of C's memory_order_acquire.
Stores to acquire/release symbols are release stores; i.e., loads and stores before a given release store will not be reordered to after that store. This matches the semantics of C's memory_order_release.
Acquire/release accesses have a release-acquire ordering.
Acquire/release symbols also have all the same guarantees that opaque symbols have.

Volatile

Volatile accesses have a sequentially-consistent ordering. This matches the semantics of C's memory_order_seq_cst.
Volatile symbols also have all the same guarantees that acquire/release symbols have.
This is the same as volatile semantics prior to this change.

Additionally, see the notes on memory ordering semantics in the documentation for Java's VarHandle.
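As a rough illustration of the acquire/release level described above, here is a minimal message-passing sketch in standard C++ rather than OMR code. It assumes `std::atomic` with `memory_order_acquire` and `memory_order_release` is an adequate stand-in for an acquire/release symbol, and that a plain `int` stands in for a transparent one. The names `publish`, `consume` and `runMessagePassing` are hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Plain (transparent-like) data: it has no ordering guarantees of its own.
static int payload = 0;

// Flag accessed with acquire/release semantics (assumed stand-in for an
// acquire/release symbol).
static std::atomic<bool> ready{false};

// Writer: the plain store to payload is not reordered after the release store.
void publish(int value)
   {
   payload = value;
   ready.store(true, std::memory_order_release);
   }

// Reader: spins on an acquire load; the later read of payload is not
// reordered before that load, so it observes the writer's value.
int consume()
   {
   while (!ready.load(std::memory_order_acquire))
      std::this_thread::yield();
   return payload;
   }

// Runs one reader and one writer and returns the value the reader observed.
int runMessagePassing(int value)
   {
   ready.store(false, std::memory_order_seq_cst);
   int seen = 0;
   std::thread reader([&seen] { seen = consume(); });
   std::thread writer([value] { publish(value); });
   writer.join();
   reader.join();
   return seen;
   }
```

If both accesses to `ready` were weakened to `memory_order_relaxed` (roughly the opaque level), the flag itself would still be atomic and coherent, but the read of `payload` would no longer be ordered after the writer's store.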

Spencer-Comin · 2024-10-31T23:10:43Z

OpenJ9 note: This requires a coordinated merge with eclipse-openj9/openj9#20475.

Spencer-Comin · 2024-11-07T15:45:57Z

@hzongaro @zl-wang @0xdaryl @vijaysun-omr FYI

hzongaro

This is a review based on an initial pass through the changes. I'll come back for a more detailed review.

One high-level question I wanted to ask is whether these changes ought to be vetted at an OMR architecture meeting.

compiler/il/Aliases.cpp

compiler/il/OMRSymbol.hpp

compiler/aarch64/codegen/OMRTreeEvaluator.cpp

compiler/arm/codegen/OMRTreeEvaluator.cpp

compiler/il/Aliases.cpp

Spencer-Comin · 2024-11-26T19:19:18Z

Re @hzongaro's question

One high-level question I wanted to ask is whether these changes ought to be vetted at an OMR architecture meeting.

@0xdaryl @vijaysun-omr what do you think?

vijaysun-omr · 2024-12-05T23:20:54Z

Jenkins build all

vijaysun-omr · 2024-12-05T23:22:03Z

I will defer the question on whether we need architecture meeting review to Daryl, but I'll start running tests based on a review I just did.

zl-wang · 2024-12-06T16:24:06Z

compiler/il/OMRSymbol.hpp

+
+      /**
+       * Memory access ordering semantics flags
+       */


it looks like you are mixing up two types of ordering: 1) the order instructions are laid down in memory (more closely related to instruction-dispatching order. that might be what you meant by program order. relevant to compiler optimization.); 2) the order in which memory accesses are executed/observed (more closely related to instruction-issuing or cache RC-machine actioning. need of memory-barriers to enforce a certain order.).

from a quick glimpse of the code, i have a high-level comment for you to consider further: you seem to be changing a lot of places that query 2) above into querying 1) above. is that optimal, or could it even have correctness implications?

Opaque semantics means that the execution order of an access to the symbol will match the program order as observed by the executing thread.

on a weakly-consistent machine (e.g. POWER), you don't have any guarantee of the execution order at all without memory barriers. i.e. program order is meaningless unless we are talking about accesses to the same location (otherwise it can lead to paradoxical situations).

From the description of Opaque semantics in the Java VarHandle documentation:

Opaque operations are bitwise atomic and coherently ordered with respect to accesses to the same variable.

If I'm understanding those semantics correctly, for Opaque we only need to ensure that accesses to the same variable/address are executed in the same order (as seen by the executing thread) with respect to each other as they are laid down in the program order. If I'm understanding the weakly-consistent machine memory model correctly, the address dependency between the accesses should be enough to ensure this order without memory barriers.
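The per-location guarantee discussed here can be illustrated outside OMR with C++ relaxed atomics, used below as an assumed stand-in for Opaque accesses. The sketch checks that a reader of a single variable never sees its value go backwards, even though no barriers order it against other locations. The function name `relaxedReaderIsMonotonic` is hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <thread>

// Returns true if a reader that uses only relaxed loads never observes the
// counter decreasing. Coherence for a single atomic variable guarantees this;
// nothing is promised about ordering relative to other variables.
bool relaxedReaderIsMonotonic(uint32_t iterations)
   {
   std::atomic<uint32_t> counter{0};
   bool monotonic = true;

   // Single writer: the modification order of counter is 1, 2, ..., iterations.
   std::thread writer([&] {
      for (uint32_t i = 1; i <= iterations; ++i)
         counter.store(i, std::memory_order_relaxed);
   });

   // Reader: polls until it sees the final value, checking each observation.
   std::thread reader([&] {
      uint32_t last = 0;
      while (last < iterations)
         {
         uint32_t now = counter.load(std::memory_order_relaxed);
         if (now < last)
            monotonic = false;
         last = now;
         }
   });

   writer.join();
   reader.join();
   return monotonic;
   }
```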

0xdaryl

Thanks for making these changes and thinking through the effects on each architecture.

My main concern is potential confusion around the terms used for the various memory orderings. "Opaque", for example, isn't a well-known term and I think is only used in Java circles. So seeing it appear throughout the code does take some mental adjustment if someone isn't already familiar with the Java definitions. Nevertheless, I am supportive of providing refinement to the kinds of memory ordering the compiler has to deal with, and we have to call them something.

In many places where isOpaque() appears, I think the semantics you're really trying to capture is !isTransparent(), because you're relying on the opaqueness property to be also true for acquire/release and volatile memory orderings. If that's the case, then (to me at least), it might be more readable to use that instead in those places.

compiler/il/OMRSymbol.hpp

compiler/arm/codegen/FPTreeEvaluator.cpp

Spencer-Comin · 2024-12-06T17:25:32Z

@0xdaryl perhaps to be more clear we could steal the naming scheme from LLVM's atomic ordering and rename Volatile to SequentiallyConsistent, Opaque to Monotonic, and Transparent to NonAtomic and use isAtLeastOrStrongerThan* helpers

0xdaryl · 2024-12-06T19:55:55Z

Also, please see the CI failures. There are real build issues in some of them.

0xdaryl · 2024-12-09T17:04:56Z

It looks like LLVM aligns more with the C++ memory model and can accommodate Java semantics too (as well as other frontends). I don't think OMR has to pivot there just yet. Changing the memory model is something worthy of a longer, architectural discussion for sure. I'm happy for us to have that discussion, but I think what you have here is a fine bridge between what we have now (simple vs volatile) and those more granular orderings.

I do like the isAtLeastOrStrongerThan* helpers idea though as it can make the code clearer as well as the author's intents.
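A minimal, standalone sketch of what such helpers could look like follows. It is not the PR's actual code: `MemoryOrdering` and `SketchSymbol` are hypothetical names, and it assumes the four levels form a strict weakest-to-strongest chain and that `isOpaque()` is an exact-match query.

```cpp
#include <cassert>

// Hypothetical ordering levels, declared weakest to strongest so that the
// enumerator values can be compared directly.
enum class MemoryOrdering { Transparent = 0, Opaque, AcquireRelease, Volatile };

// Hypothetical stand-in for a symbol carrying one ordering level.
class SketchSymbol
   {
   public:
   explicit SketchSymbol(MemoryOrdering ordering) : _ordering(ordering) {}

   // Exact-match queries: true for exactly one level each.
   bool isTransparent() const { return _ordering == MemoryOrdering::Transparent; }
   bool isOpaque() const { return _ordering == MemoryOrdering::Opaque; }
   bool isAcquireRelease() const { return _ordering == MemoryOrdering::AcquireRelease; }
   bool isVolatile() const { return _ordering == MemoryOrdering::Volatile; }

   // Range queries: true for the named level and every stronger one.
   bool isAtLeastOrStrongerThan(MemoryOrdering other) const { return _ordering >= other; }
   bool isAtLeastOrStrongerThanOpaque() const
      {
      return isAtLeastOrStrongerThan(MemoryOrdering::Opaque);
      }
   bool isAtLeastOrStrongerThanAcquireRelease() const
      {
      return isAtLeastOrStrongerThan(MemoryOrdering::AcquireRelease);
      }

   private:
   MemoryOrdering _ordering;
   };
```

Under these assumptions, `isAtLeastOrStrongerThanOpaque()` and `!isTransparent()` give the same answer for every level, which is the readability point raised in the review.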

hzongaro · 2024-12-09T23:08:14Z

Sorry for the really basic questions, but I'm still trying to understand the semantics in the various cases.

Expand the possible memory ordering semantics of symbols from volatile or non-volatile to volatile, acquire/release, opaque, or transparent. The memory ordering semantics are defined as follows:

Volatile: same as volatile before this change.

Acquire/Release: as defined for C's memory_order_acq_rel

Opaque: accessed in program order, but without any assurance of memory ordering effects on other threads. This is similar to volatile in C or C++.

Transparent: same as non-volatile before this change.

Additionally, see the notes on memory ordering semantics in the documentation for Java's VarHandle

The description of memory_order_acq_rel seems to be specific to operations that perform both a read and write. I assume that a symbol that is marked with TR::Symbol::AcquireReleaseSemantics will be treated like memory_order_acquire in a read operation and like memory_order_release in a write operation - is that correct?
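The PR description at the top of this thread defines loads of acquire/release symbols as acquire loads and stores as release stores. A small sketch of that per-access mapping, expressed with C++ memory orders, is shown below. The enum names and `closestCppOrder` are hypothetical, and treating Opaque as `memory_order_relaxed` is an approximation assumed for illustration.

```cpp
#include <atomic>
#include <cassert>
#include <optional>

// Hypothetical names for the four symbol-level semantics and the access kind.
enum class Semantics { Transparent, Opaque, AcquireRelease, Volatile };
enum class Access { Load, Store };

// Returns the closest C++ memory order for one access, or nullopt for
// Transparent, which corresponds to a plain (non-atomic) access.
std::optional<std::memory_order> closestCppOrder(Semantics semantics, Access access)
   {
   switch (semantics)
      {
      case Semantics::Transparent:
         return std::nullopt;
      case Semantics::Opaque:
         return std::memory_order_relaxed; // assumed approximation
      case Semantics::AcquireRelease:
         // The symbol-level flag splits by access kind: loads acquire, stores release.
         return access == Access::Load ? std::memory_order_acquire
                                       : std::memory_order_release;
      case Semantics::Volatile:
         return std::memory_order_seq_cst;
      }
   return std::nullopt;
   }
```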

Also, the descriptions of memory ordering semantics for Java's VarHandle states, in part:

In addition to obeying Opaque properties, Acquire mode reads and their subsequent accesses are ordered after matching Release mode writes and their previous accesses.

But if my understanding of the current implementation is correct, TR::Symbol::AcquireReleaseSemantics doesn't seem to imply TR::Symbol::OpaqueSemantics. If so, what parts of the VarHandle documentation should I read and how do they map to the semantics in this implementation?

Spencer-Comin · 2024-12-10T17:06:02Z

Sorry for the really basic questions

@hzongaro if those were basic questions this change would have been a lot easier to implement!

My initial description (and likely my initial understanding) of the memory semantics may not have been 100% accurate. The important parts from the VarHandle description are the descriptions of getVolatile, setVolatile, getOpaque, setOpaque, getAcquire, and setRelease, and the following paragraph:

Access modes control atomicity and consistency properties. Plain read (get) and write (set) accesses are guaranteed to be bitwise atomic only for references and for primitive values of at most 32 bits, and impose no observable ordering constraints with respect to threads other than the executing thread. Opaque operations are bitwise atomic and coherently ordered with respect to accesses to the same variable. In addition to obeying Opaque properties, Acquire mode reads and their subsequent accesses are ordered after matching Release mode writes and their previous accesses. In addition to obeying Acquire and Release properties, all Volatile operations are totally ordered with respect to each other.

I'll update the description of this PR to have a more thorough description of the different semantics.

This change expands the possible memory ordering semantics for a symbol from volatile and non-volatile to volatile, acquire/release, optimization opaque, and transparent. An enum and helper methods are added to facilitate working with memory ordering semantics. Signed-off-by: Spencer Comin <[email protected]>

Spencer-Comin · 2025-01-06T15:08:28Z

@0xdaryl @hzongaro @zl-wang could I get another round of review on this?

hzongaro

Thanks for all the updates that help to clarify the semantics!

I have some questions about whether some cases that are effectively testing for OpaqueSemantics or stronger ought to be testing for AcquireReleaseSemantics or stronger instead.

hzongaro · 2025-01-06T20:22:07Z

compiler/il/OMRSymbol.hpp

+   inline bool isTransparent();
+
+   inline void setOpaque();
+   inline bool isOpaque();
+   inline bool isAtLeastOrStrongerThanOpaque();
+
+   inline void setAcquireRelease();
+   inline bool isAcquireRelease();
+   inline bool isAtLeastOrStrongerThanAcquireRelease();
+
+   inline void setVolatile();
+   inline bool isVolatile();


Is it conceivable that there could be an ordering semantics weaker than TransparentSemantics or one stronger than VolatileSemantics? I'm wondering whether isAtLeastOrStrongerThanTransparent() and isAtLeastOrStrongerThanVolatile() methods would ever make sense. Alternatively, would it make sense to have a single isAtLeastOrStrongerThan(MemoryOrdering) method instead?

The answer can be maybe, but not likely enough to worry about it right now.

The only guarantee that TransparentSemantics gives is address dependency, i.e., that for a single threaded application a read from an address a will read the value written by the most recent (according to program order) write to a. It is conceivable that there could be an ordering semantics that doesn't guarantee address dependency, but no CPU architecture supported by OMR has semantics that weak. If by some odd coincidence we ever support the DEC Alpha or some new architecture comes up that refuses to learn from history, we may have to revisit this.

Looking at what instructions we generate for volatile stores[1] and loads[2], it appears that we generate fully sequentially consistent stores (memory barrier before and after), but only acquire loads (memory barrier only after). Conceivably we could have a stronger ordering semantic that has fully sequentially consistent loads (see godbolt example [3]). Whether supporting full sequential consistency is something we want to do might be worth discussion. I'll look into this some more, then update the documentation here for VolatileSemantics if it really is observably different from full sequential consistency.

[1]

omr/compiler/aarch64/codegen/OMRTreeEvaluator.cpp

Lines 6191 to 6291 in 0e72488

    TR::Register *commonStoreEvaluator(TR::Node *node, TR::InstOpCode::Mnemonic op, int32_t size, TR::CodeGenerator *cg)
       {
       TR::MemoryReference *tempMR = TR::MemoryReference::createWithRootLoadOrStore(cg, node);
       tempMR->validateImmediateOffsetAlignment(node, size, cg);

       bool needSync = (node->getSymbolReference()->getSymbol()->isSyncVolatile() && cg->comp()->target().isSMP());
       bool lazyVolatile = false;
       if (node->getSymbolReference()->getSymbol()->isShadow() &&
           node->getSymbolReference()->getSymbol()->isOrdered() && cg->comp()->target().isSMP())
          {
          needSync = true;
          lazyVolatile = true;
          }

       TR::Node *valueChild;

       if (node->getOpCode().isIndirect())
          {
          valueChild = node->getSecondChild();
          }
       else
          {
          valueChild = node->getFirstChild();
          }

       if (needSync)
          {
          generateSynchronizationInstruction(cg, TR::InstOpCode::dmb, node, TR::InstOpCode::ishst);
          }

       TR::Node *valueChildRoot = NULL;
       /*
        * Pattern matching compressed refs sequence of address constant NULL
        *
        * treetop
        *   istorei
        *     aload
        *     l2i (X==0 )
        *       lushr (compressionSequence )
        *         a2l
        *           aconst NULL (X==0 sharedMemory )
        *         iconst 3
        */
       if (cg->comp()->useCompressedPointers() &&
           (node->getSymbolReference()->getSymbol()->getDataType() == TR::Address) &&
           (valueChild->getDataType() != TR::Address) &&
           (valueChild->getOpCodeValue() == TR::l2i) &&
           (valueChild->isZero()))
          {
          TR::Node *tmpNode = valueChild;
          while (tmpNode->getNumChildren() && tmpNode->getOpCodeValue() != TR::a2l)
             tmpNode = tmpNode->getFirstChild();
          if (tmpNode->getNumChildren())
             tmpNode = tmpNode->getFirstChild();

          if (tmpNode->getDataType().isAddress() && tmpNode->isConstZeroValue() && (tmpNode->getRegister() == NULL))
             {
             valueChildRoot = valueChild;
             }
          }

       /*
        * Use xzr as source register of str instruction
        * if valueChild is a compressed refs sequence of address constant NULL,
        * or valueChild is a zero constant integer.
        */
       if ((valueChildRoot != NULL) || (valueChild->getDataType().isIntegral() && valueChild->isConstZeroValue() && (valueChild->getRegister() == NULL)))
          {
          TR::Register *zeroReg = cg->allocateRegister();
          generateMemSrc1Instruction(cg, op, node, tempMR, zeroReg);
          TR::RegisterDependencyConditions *deps = new (cg->trHeapMemory()) TR::RegisterDependencyConditions(0, 1, cg->trMemory());
          deps->addPostCondition(zeroReg, TR::RealRegister::xzr);
          generateLabelInstruction(cg, TR::InstOpCode::label, node, generateLabelSymbol(cg), deps);
          cg->stopUsingRegister(zeroReg);
          }
       else
          {
          generateMemSrc1Instruction(cg, op, node, tempMR, cg->evaluate(valueChild));
          }

       if (needSync)
          {
          // ordered and lazySet operations will not generate a post-write sync
          if (!lazyVolatile)
             {
             generateSynchronizationInstruction(cg, TR::InstOpCode::dmb, node, TR::InstOpCode::ish);
             }
          }

       if (valueChildRoot != NULL)
          {
          cg->recursivelyDecReferenceCount(valueChildRoot);
          }
       else
          {
          cg->decReferenceCount(valueChild);
          }
       tempMR->decNodeReferenceCounts(cg);

       return NULL;
       }

[2]

omr/compiler/aarch64/codegen/OMRTreeEvaluator.cpp

Lines 6070 to 6088 in 0e72488

    TR::Register *commonLoadEvaluator(TR::Node *node, TR::InstOpCode::Mnemonic op, int32_t size, TR::Register *targetReg, TR::CodeGenerator *cg)
       {
       bool needSync = (node->getSymbolReference()->getSymbol()->isSyncVolatile() && cg->comp()->target().isSMP());

       node->setRegister(targetReg);
       TR::MemoryReference *tempMR = TR::MemoryReference::createWithRootLoadOrStore(cg, node);
       tempMR->validateImmediateOffsetAlignment(node, size, cg);

       generateTrg1MemInstruction(cg, op, node, targetReg, tempMR);

       if (needSync)
          {
          generateSynchronizationInstruction(cg, TR::InstOpCode::dmb, node, TR::InstOpCode::ishld);
          }

       tempMR->decNodeReferenceCounts(cg);

       return targetReg;
       }

[3] https://godbolt.org/z/3PYTc9W3G
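To make the barrier placement in the two quoted evaluators easier to compare, the following standalone sketch models only the order of emitted instructions as strings. It is a simplified model, not OMR code: `storeSequence` and `loadSequence` are hypothetical names, and `str` and `ldr` are placeholders for whatever opcode the evaluator is called with.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Models the quoted store evaluator: a store barrier before the store when
// needSync is set, and a full barrier after it unless the store is lazy/ordered.
std::vector<std::string> storeSequence(bool needSync, bool lazyVolatile)
   {
   std::vector<std::string> sequence;
   if (needSync)
      sequence.push_back("dmb ishst");
   sequence.push_back("str"); // placeholder for the actual store opcode
   if (needSync && !lazyVolatile)
      sequence.push_back("dmb ish");
   return sequence;
   }

// Models the quoted load evaluator: a load barrier only after the load.
std::vector<std::string> loadSequence(bool needSync)
   {
   std::vector<std::string> sequence;
   sequence.push_back("ldr"); // placeholder for the actual load opcode
   if (needSync)
      sequence.push_back("dmb ishld");
   return sequence;
   }
```

In this model a volatile store is bracketed by barriers while a volatile load has only a trailing barrier, which is the asymmetry described in the comment above.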

hzongaro · 2025-01-06T20:55:18Z

compiler/p/codegen/OMRLoadStoreHandler.cpp

    // Since non-volatiles are implemented as two separate loads, we must use a special sequence to perform the load in
    // a single instruction even when SMP is disabled.
-    else if (node->getSymbol()->isSyncVolatile())
+    else if (node->getSymbol()->isAtLeastOrStrongerThanAcquireRelease())


Will this comment need to be adjusted?

hzongaro · 2025-01-06T21:50:18Z

compiler/il/Aliases.cpp

@@ -315,7 +315,7 @@ OMR::SymbolReference::getUseDefAliasesBV(bool isDirectCall, bool includeGCSafePo
            // (this is the same as before), or if we are unresolved and condy
            // (this is the extra condition added), we would return conservative aliases.
            if ((self()->isUnresolved() && (_symbol->isConstantDynamic() || !_symbol->isConstObjectRef())) ||
-	        _symbol->isVolatile() || self()->isLiteralPoolAddress() ||
+                !_symbol->isTransparent() || self()->isLiteralPoolAddress() ||


I'm wondering whether testing !_symbol->isTransparent() is more conservative than necessary, though it's definitely safe. I'm just trying to understand whether OpaqueSemantics really needs to be treated conservatively here, or if testing for _symbol->isAtLeastOrStrongerThanAcquireRelease() would be sufficient here and elsewhere in this method.

@vijaysun-omr, @zl-wang, thoughts?

hzongaro · 2025-01-06T22:04:32Z

compiler/il/OMRNode.cpp

@@ -2662,7 +2662,7 @@ OMR::Node::mayModifyValue(TR::SymbolReference * symRef)
   TR::Symbol * symbol = symRef->getSymbol();
   if (node->getOpCode().isCall() ||
       node->getOpCodeValue() == TR::monexit ||
-       (node->getOpCode().hasSymbolReference() && node->getSymbol()->isVolatile()) ||
+       (node->getOpCode().hasSymbolReference() && !node->getSymbol()->isTransparent()) ||


Again, is testing !isTransparent() more conservative than necessary? I'm wondering whether isAtLeastOrStrongerThanAcquireRelease() would be sufficient here.

hzongaro · 2025-01-07T16:27:26Z

compiler/il/OMRNode_inlines.hpp

@@ -176,7 +176,7 @@ OMR::Node::mayUse()
 TR_UseDefAliasSetInterface
 OMR::Node::mayKill(bool gcSafe)
   {
-   if (self()->getOpCode().hasSymbolReference() && (self()->getOpCode().isLikeDef() || self()->mightHaveVolatileSymbolReference())) //we want the old behavior in these cases
+   if (self()->getOpCode().hasSymbolReference() && (self()->getOpCode().isLikeDef() || self()->mightHaveNonTransparentSymbolReference())) //we want the old behavior in these cases


Again, I'm wondering whether OpaqueSemantics really needs to be treated conservatively here, and if this could safely check for "at least or stronger than acquire/release". Does this need to worry only about what other threads might do? If so, I don't think the fact that opaque operations need to be performed atomically would matter.

hzongaro · 2025-01-07T16:34:07Z

compiler/optimizer/DeadTreesElimination.cpp

@@ -306,7 +306,7 @@ static bool isSafeToReplaceNode(TR::Node *currentNode, TR::TreeTop *curTreeTop,
       *    => xload/xloadi a.volatileField
       *    ...
       */
-      //if (mayBeVolatileReference && !canMoveIfVolatile)
+      //if (mayBeNonTransparentReference && !canMoveIfVolatile)


Comments that appear just above here talk about restrictions on "swinging down" volatile. Those comments will need to be updated.

hzongaro · 2025-01-07T16:38:22Z

compiler/optimizer/DeadTreesElimination.cpp

+   bool mayBeNonTransparentReference = currentNode->mightHaveNonTransparentSymbolReference();
+   // Do not swing down non-transparent nodes
+   if (mayBeNonTransparentReference)


I don't think this needs to be conservative in the treatment of OpaqueSemantics. I think it really only needs to worry about situations where there could be interactions with other threads, so AcquireReleaseSemantics or stronger.

I suspect the same might hold true for many of the changes for other optimizations, so I think they need to be considered on a case-by-case basis. I won't go through and comment on each.
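The difference between the two tests being discussed can be stated compactly. The sketch below uses a hypothetical `MemoryOrdering` enum and hypothetical function names, and assumes the four levels are totally ordered; it shows that the policies disagree only for Opaque symbols.

```cpp
#include <cassert>

// Hypothetical ordering levels, weakest to strongest.
enum class MemoryOrdering { Transparent = 0, Opaque, AcquireRelease, Volatile };

// Test used in this PR: anything that is not transparent is treated conservatively.
bool conservativeInThisPR(MemoryOrdering ordering)
   {
   return ordering != MemoryOrdering::Transparent;
   }

// Relaxed test suggested in review: only levels with cross-thread ordering
// effects (acquire/release or stronger) are treated conservatively.
bool conservativeIfRelaxed(MemoryOrdering ordering)
   {
   return ordering >= MemoryOrdering::AcquireRelease;
   }

// True when the two policies would make different decisions for this level.
bool policiesDisagree(MemoryOrdering ordering)
   {
   return conservativeInThisPR(ordering) != conservativeIfRelaxed(ordering);
   }
```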

Spencer-Comin · 2025-01-15T16:05:55Z

@hzongaro re: overly conservative treatment of OpaqueSemantics

Since being overly conservative is still correct (and equivalent to what we already have with volatile/plain), do you think it would be better to get these changes in as-is and then address relaxing the restrictions on individual optimizations in future PRs?

hzongaro · 2025-01-15T18:52:51Z

re: overly conservative treatment of OpaqueSemantics

Since being overly conservative is still correct (and equivalent to what we already have with volatile/plain), do you think it would be better to get these changes in as-is and then address relaxing the restrictions on individual optimizations in future PRs?

Yes, that sounds reasonable. Getting initial support in for the different memory semantics will allow OMR and downstream projects to begin to take advantage of them, even if the treatment is relatively conservative today.

With the expansion of possible memory ordering semantics from binary volatile or non-volatile to volatile, acquire/release, opaque, and transparent, all tests of whether a symbol is volatile need to be refined depending on the intention of the test, i.e. is it testing if the symbol is strictly volatile, simply opaque, or somewhere in between? Signed-off-by: Spencer Comin <[email protected]>

This change adds arrays for opaque and acquire/release unsafe symrefs to the symbol reference table. Instead of having four separate fields, the fields are combined into an array that can be indexed by the OMR::Symbol::AccessMode enum. Signed-off-by: Spencer Comin <[email protected]>
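The idea of replacing separate per-mode fields with one enum-indexed array can be sketched as follows. This is an illustration only: `AccessMode` here is a local hypothetical enum (its enumerator names are assumed, not taken from OMR), and `SketchSymRef` and `SketchSymRefTable` are stand-ins for the real symbol reference types.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <memory>

// Hypothetical access modes; NumAccessModes is a count, not a real mode.
enum AccessMode { Transparent = 0, Opaque, AcquireRelease, Volatile, NumAccessModes };

// Hypothetical stand-in for a symbol reference.
struct SketchSymRef
   {
   AccessMode mode;
   };

// One array slot per access mode replaces four separately named fields.
class SketchSymRefTable
   {
   public:
   // Returns the cached entry for the mode, creating it on first use.
   SketchSymRef *findOrCreateUnsafeSymRef(AccessMode mode)
      {
      std::unique_ptr<SketchSymRef> &slot = _unsafeSymRefs[static_cast<std::size_t>(mode)];
      if (!slot)
         slot.reset(new SketchSymRef{mode});
      return slot.get();
      }

   private:
   std::array<std::unique_ptr<SketchSymRef>, NumAccessModes> _unsafeSymRefs;
   };
```

With this layout, code that handles every mode can loop over the array instead of naming each field, and adding a mode only changes the enum.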

This flag is removed in OpenJ9. Signed-off-by: Spencer Comin <[email protected]>

Spencer-Comin mentioned this pull request Oct 31, 2024

Implement Unsafe opaque and acquire/release put and get methods eclipse-openj9/openj9#20475

Open

github-actions bot added arch:aarch32 arch:aarch64 comp:compiler labels Oct 31, 2024

Spencer-Comin force-pushed the ordered-opaque branch 4 times, most recently from e353977 to fead69c Compare November 7, 2024 15:24

Spencer-Comin marked this pull request as ready for review November 7, 2024 15:42

Spencer-Comin requested review from 0xdaryl, knn-k, vijaysun-omr and mstoodle as code owners November 7, 2024 15:42

Spencer-Comin requested review from hzongaro and 0xdaryl and removed request for 0xdaryl November 7, 2024 15:44

hzongaro reviewed Nov 20, 2024


Spencer-Comin force-pushed the ordered-opaque branch 4 times, most recently from 663a876 to 3a4d76a Compare November 26, 2024 19:10

vijaysun-omr approved these changes Dec 5, 2024


0xdaryl self-assigned this Dec 6, 2024

zl-wang reviewed Dec 6, 2024


0xdaryl reviewed Dec 6, 2024


Spencer-Comin force-pushed the ordered-opaque branch from 3a4d76a to edf4efe Compare December 11, 2024 16:20

Spencer-Comin force-pushed the ordered-opaque branch from edf4efe to 0b6b3f1 Compare December 11, 2024 16:37

hzongaro reviewed Jan 7, 2025


Spencer-Comin added 3 commits January 17, 2025 12:53

Remove chkDontInlineUnsafePutOrderedCall debug flag print

e6464b4

This flag is removed in OpenJ9. Signed-off-by: Spencer Comin <[email protected]>

Spencer-Comin force-pushed the ordered-opaque branch from 0b6b3f1 to e6464b4 Compare January 17, 2025 17:53
