Inconsistent cmake settings for TLH_PREFETCH_FTA #18870

knn-k · 2024-02-01T09:26:48Z

There are flags OMR_GC_TLH_PREFETCH_FTA and J9VM_GC_TLH_PREFETCH_FTA.
Settings of these flags in cmake files are complicated, and look inconsistent for some platforms.

runtime/cmake/caches/common.cmake enables J9VM_GC_TLH_PREFETCH_FTA. The file does not have an entry for OMR_GC_TLH_PREFETCH_FTA.

openj9/runtime/cmake/caches/common.cmake

Line 149 in ac66d8c

set(J9VM_GC_TLH_PREFETCH_FTA ON CACHE BOOL "")

Platform	OMR_GC_TLH_PREFETCH_FTA	J9VM_GC_TLH_PREFETCH_FTA
linux_x86-64	ON (explicit)	ON (common)
aix_ppc-64	(no setting)	OFF (explicit)
linux_ppc-64_le	OFF (explicit)	ON (common)
zos_390-64	OFF (explicit)	OFF (explicit)
linux_390-64	(no setting)	ON (common)
linux_aarch64	(no setting)	ON (common)

The text was updated successfully, but these errors were encountered:

knn-k · 2024-02-01T09:46:39Z

It seems the OMR_GC_TLH_PREFETCH_FTA value in the generated omrcfg.h reflects the J9VM_GC_TLH_PREFETCH_FTA value, as far as I tried on AArch64 Linux turning the J9VM_ flag on and off.

What happens on linux_ppc-64_le, where J9VM_GC_TLH_PREFETCH_FTA is ON and OMR_GC_TLH_PREFETCH_FTA is OFF?
Can anybody look at OMR_GC_TLH_PREFETCH_FTA in the omrcfg.h for linux_ppc-64_le, please?

knn-k · 2024-02-01T10:47:46Z

OMR_GC_TLH_PREFETCH_FTA is defined in the generated omrcfg.h when J9VM_GC_TLH_PREFETCH_FTA is ON and OMR_GC_TLH_PREFETCH_FTA is OFF in cmake files. J9VM_GC_TLH_PREFETCH_FTA setting seems to override OMR_GC_TLH_PREFETCH_FTA.

I think setting the OMR_GC_TLH_PREFETCH_FTA value in cmake files makes it confusing.
How about removing OMR_GC_TLH_PREFETCH_FTA lines from cmake files?

knn-k · 2024-02-01T23:44:58Z

@zl-wang Do you have any thoughts on:

Why linux_ppc-64_le sets OMR_GC_TLH_PREFETCH_FTA to OFF while J9VM_GC_TLH_PREFETCH_FTA is ON
Why aix_ppc-64 sets J9VM_GC_TLH_PREFETCH_FTA to OFF

linux_ppc-64_le:

openj9/runtime/cmake/caches/linux_ppc-64_le.cmake

Line 42 in edbf0e1

set(OMR_GC_TLH_PREFETCH_FTA OFF CACHE BOOL "")

aix_ppc-64:

openj9/runtime/cmake/caches/aix_ppc-64.cmake

Line 34 in edbf0e1

set(J9VM_GC_TLH_PREFETCH_FTA OFF CACHE BOOL "")

knn-k · 2024-02-02T03:13:56Z

AArch64 platforms (Linux and macOS): J9VM_GC_TLH_PREFETCH_FTA is enabled, but the JIT compiler does not use the tlhPrefetchFTA field in J9VMThread at all.

zl-wang · 2024-02-02T15:03:02Z

It looks to me that it is off in all cases: (my search results below)
FILE: /team/zlwang/repositories/openj9/runtime/cmake/caches/aix_ppc-64.cmake

set(J9VM_GC_TLH_PREFETCH_FTA OFF CACHE BOOL "")
FILE: /team/zlwang/repositories/openj9/runtime/cmake/caches/linux_ppc-64_le.cmake

set(OMR_GC_TLH_PREFETCH_FTA OFF CACHE BOOL "")

It is indeed confusing. Why it is even relevant to OMR for a prefetch scheme dependent on J9VMThread support. In my opinion, this should not be visible in OMR side.

dmitripivkine · 2024-02-02T15:16:07Z

It is indeed confusing. Why it is even relevant to OMR for a prefetch scheme dependent on J9VMThread support. In my opinion, this should not be visible in OMR side.

I think it was an attempt to provide definition in OMR because GC TLH code lives there (in combination with desire to keep OMR independent upstream project). So, either we need to keep OMR_* and J9VM_* settings synchronized or use compromise and check J9VM_GC_TLH_PREFETCH_FTA in OMR code directly. Quick search shows we do have J9 definitions usage in OMR code already.

knn-k · 2024-02-02T22:50:54Z

@zl-wang Note that the symbol in aix_ppc-64.cmake starts with J9VM_, and the one in linux_ppc-64_le.cmake starts with OMR_. They are different symbols.
Please also look at the definitions of those symbols in j9cfg.h and omrcfg.h that are generated during the build.

knn-k · 2024-02-07T03:15:04Z

The following files in buildspecs set the gc_tlhPrefetchFTA flag:

linux_aarch64
linux_arm
linux_riscv64
linux_x86-64
linux_x86
osx_x86-64
win_x86-64
win_x86

linux_ppc-64_le.spec and linux_390-64.spec don't set the flag, which are inconsistent with the setting in common.cmake.

knn-k · 2024-02-07T07:42:40Z

Looking back the histories of .cmake files:

Power

aix_ppc-64.cmake: J9VM_GC_TLH_PREFETCH_FTA was turned off by CMake: Add missing options to aix cache #9142.
linux_ppc-64_le.cmake: OMR_GC_TLH_PREFETCH_FTA was turned off by the initial cmake file for plinux CMake: Bring up plinux #2566. J9VM_GC_TLH_PREFETCH_FTA is enabled by common.cmake.

Z

zos_390-64.cmake: The two symbols are turned off by the initial cmake file for zos by CMake: Add initial zos caches #11994
linux_390-64.cmake: No specific settings for the two symbols. J9VM_GC_TLH_PREFETCH_FTA is enabled by common.cmake.

knn-k · 2024-02-07T08:01:46Z

@joransiu FYI.

knn-k · 2024-02-08T04:19:16Z

I ran personal builds to see if OMR_GC_TLH_PREFETCH_FTA is defined or not on Power and Z. AIX, z/OS, and Linux have different values as shown below.

ppc64_aix: Undefined
ppc64le_linux: Defined (The cmake file sets it to OFF, but that is ignored)
s390x_linux: Defined
s390x_zos: Undefined

(internal job: view/OpenJ9%20-%20Personal/job/Pipeline-Build-Test-Personal/20537/)

zl-wang · 2024-02-08T13:46:29Z

we don't need to set the macro(s) consistently across platforms, but of course setting them consistently on each single platform is a good step to take.

In order to realize the performance benefit of TLH-prefetch which has been dropped out of favour gradually in the past while, you need to walk a fine line anyway between micro-architecture and codegen. firstly, if you can gain benefit from TLH batch clearing which is default on p at least, TLH-prefetch is an overlapping action by and large. secondly, prefetch instructions with temporal characteristics were the ones having performance benefits from past experiences (since hardware prefetch does pretty much the same thing as explicit vanilla prefetch instructions and without consuming instruction cache capacity).

knn-k · 2024-02-08T14:01:28Z

@zl-wang I can change PR #18918 not to disable J9VM_GC_TLH_PREFETCH_FTA in linux_ppc-64_le.cmake. Is that what you mean, as the first step?

dmitripivkine · 2024-02-08T14:16:23Z

if you can gain benefit from TLH batch clearing which is default on p at least, TLH-prefetch is an overlapping action by and large

Good point. As I understand we are about to enable TLH batch clearing on X and ARM. So, relation between TLH batch clearing and TLH-prefetch should be considered.

knn-k · 2024-02-09T01:11:23Z

I updated PR #18918. It does not change J9VM_GC_TLH_PREFETCH_FTA in linux_ppc-64_le.cmake and in linux_390-64.cmake now.

knn-k · 2024-02-09T01:18:11Z

See J9VM_GC_TLH_PREFETCH_FTA settings after the change. The flag is turned off only on aix and zos.

Platform	J9VM_GC_TLH_PREFETCH_FTA
x86 and x86-64 platforms	ON (common)
aarch64 platforms	ON (common)
aix_ppc-64	OFF
linux_ppc-64_le	ON (common)
zos_390-64	OFF
linux_390-64	ON (common)

knn-k · 2024-02-09T01:44:37Z

As I understand we are about to enable TLH batch clearing on X and ARM.

TLH batch clearing is enabled by default on AArch64.
common.cmake enables J9VM_GC_BATCH_CLEAR_TLH on all the platforms.

openj9/runtime/cmake/caches/common.cmake

Line 121 in a348669

set(J9VM_GC_BATCH_CLEAR_TLH ON CACHE BOOL "")

On x86 platforms, you need to set the env variable TR_EnableBatchClear to allocate zeroed TLH pages. Other platforms have the env variable TR_DisableBatchClear.

openj9/runtime/compiler/control/DLLMain.cpp

Lines 359 to 360 in a348669

    
           #ifdef TR_HOST_X86 
        
                       static char *enableBatchClear = feGetEnv2("TR_EnableBatchClear", (void *)vm);

dmitripivkine · 2024-02-09T14:33:06Z

As I understand we are about to enable TLH batch clearing on X and ARM.

TLH batch clearing is enabled by default on AArch64. common.cmake enables J9VM_GC_BATCH_CLEAR_TLH on all the platforms.

openj9/runtime/cmake/caches/common.cmake

Line 121 in a348669

set(J9VM_GC_BATCH_CLEAR_TLH ON CACHE BOOL "")

On x86 platforms, you need to set the env variable TR_EnableBatchClear to allocate zeroed TLH pages. Other platforms have the env variable TR_DisableBatchClear.

openj9/runtime/compiler/control/DLLMain.cpp

Lines 359 to 360 in a348669

#ifdef TR_HOST_X86

static char *enableBatchClear = feGetEnv2("TR_EnableBatchClear", (void *)vm);

I believe Julian's point is: using TLH Batch clearing can make TLH Prefetching less effective significantly. Another words if we zero every TLH memory up front prefetching might be not necessary.

knn-k · 2024-02-13T02:29:44Z

My points are as follows:

Setting OMR_GC_TLH_PREFETCH_FTA in .cmake files is useless and confusing. J9VM_GC_TLH_PREFETCH_FTA overriddes it during the build. PR Clean up GC_TLH_PREFETCH_FTA flag settings in cmake files #18918 fixes it.
Power platforms have different J9VM_GC_TLH_PREFETCH_FTA settings between Linux (ON) and AIX (OFF). Is there any reason for that?
Z platforms have different J9VM_GC_TLH_PREFETCH_FTA settings between Linux (ON) and z/OS (OFF). Is there any reason for that?
gc_tlhPrefetchFTA is disabled in UMA builds for pLinux and zLinux. They are inconsistent with CMake builds for those platforms.

I would like someone to review #18918.

knn-k · 2024-02-15T01:53:56Z

Additional finding:
AIX JIT accesses J9VMThread->tlhPrefetchFTA, but the field is not updated by the GC side because J9VM_GC_TLH_PREFETCH_FTA is disabled on AIX.

p codegen accessing tlhPrefetchFTA: (There are some other locations)

openj9/runtime/compiler/p/codegen/J9TreeEvaluator.cpp

Lines 5918 to 5919 in 1690101

    
           iCursor = generateTrg1MemInstruction(cg,TR::InstOpCode::Op_load, node, temp1Reg, 
        
                 TR::MemoryReference::createWithDisplacement(cg, metaReg, offsetof(J9VMThread, tlhPrefetchFTA), TR::Compiler->om.sizeofReferenceAddress()), iCursor);

GC code updating tlhPrefetchFTA: (There are some other locations)

openj9/runtime/gc_include/ObjectAllocationAPI.hpp

Lines 172 to 174 in 1690101

    
           #if defined(J9VM_GC_TLH_PREFETCH_FTA) 
        
           						currentThread->tlhPrefetchFTA -= allocateSize; 
        
           #endif /* J9VM_GC_TLH_PREFETCH_FTA */

knn-k · 2024-02-25T10:20:24Z

PR #18918 has been merged. J9VM_GC_TLH_PREFETCH_FTA and OMR_GC_TLH_PREFETCH_FTA have consistent values on every platform now.
It is up to Power and Z people to investigate the differences between AIX and pLinux, z/OS and zLinux, respectively.

knn-k added comp:gc comp:jit cmake labels Feb 1, 2024

knn-k mentioned this issue Feb 8, 2024

Clean up GC_TLH_PREFETCH_FTA flag settings in cmake files #18918

Merged

knn-k mentioned this issue Feb 16, 2024

x86: Enable dual TLH by default eclipse-omr/omr#7234

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Inconsistent cmake settings for TLH_PREFETCH_FTA #18870

Inconsistent cmake settings for TLH_PREFETCH_FTA #18870

knn-k commented Feb 1, 2024

knn-k commented Feb 1, 2024

knn-k commented Feb 1, 2024

knn-k commented Feb 1, 2024

knn-k commented Feb 2, 2024

zl-wang commented Feb 2, 2024

dmitripivkine commented Feb 2, 2024

knn-k commented Feb 2, 2024

knn-k commented Feb 7, 2024

knn-k commented Feb 7, 2024

knn-k commented Feb 7, 2024

knn-k commented Feb 8, 2024

zl-wang commented Feb 8, 2024

knn-k commented Feb 8, 2024

dmitripivkine commented Feb 8, 2024

knn-k commented Feb 9, 2024

knn-k commented Feb 9, 2024

knn-k commented Feb 9, 2024

dmitripivkine commented Feb 9, 2024

knn-k commented Feb 13, 2024

knn-k commented Feb 15, 2024

knn-k commented Feb 25, 2024

Inconsistent cmake settings for TLH_PREFETCH_FTA #18870

Inconsistent cmake settings for TLH_PREFETCH_FTA #18870

Comments

knn-k commented Feb 1, 2024

knn-k commented Feb 1, 2024

knn-k commented Feb 1, 2024

knn-k commented Feb 1, 2024

knn-k commented Feb 2, 2024

zl-wang commented Feb 2, 2024

dmitripivkine commented Feb 2, 2024

knn-k commented Feb 2, 2024

knn-k commented Feb 7, 2024

knn-k commented Feb 7, 2024

knn-k commented Feb 7, 2024

knn-k commented Feb 8, 2024

zl-wang commented Feb 8, 2024

knn-k commented Feb 8, 2024

dmitripivkine commented Feb 8, 2024

knn-k commented Feb 9, 2024

knn-k commented Feb 9, 2024

knn-k commented Feb 9, 2024

dmitripivkine commented Feb 9, 2024

knn-k commented Feb 13, 2024

knn-k commented Feb 15, 2024

knn-k commented Feb 25, 2024