Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[TypeProf][InstrPGO] Introduce raw and instr profile format change for type profiling. #81691

Merged
merged 15 commits into from
Feb 22, 2024

Conversation

mingmingl-llvm
Copy link
Contributor

@mingmingl-llvm mingmingl-llvm commented Feb 14, 2024

  • Raw profile format
    • Header: records the byte size of compressed vtable names, and the number of profiled vtable entries (call it VTableProfData). Header also records padded bytes of each section.
    • Payload: adds a section for compressed vtable names, and a section to store VTableProfData. Both sections are padded so the size is a multiple of 8.
  • Indexed profile format
    • Header: records the byte offset of compressed vtable names.
    • Payload: adds a section to store compressed vtable names. This section is used by llvm-profdata to show the list of vtables profiled for an instrumented site.

The originally reviewed patch will have profile reader/write change and llvm-profdata change.

rfc in https://discourse.llvm.org/t/rfc-dynamic-type-profiling-and-optimizations-in-llvm/74600

Copy link

github-actions bot commented Feb 14, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

@mingmingl-llvm mingmingl-llvm changed the title [TypeProf]Introduce value type in raw profiles [TypeProf][InstrPGO] Introduce vtable value type in raw and instr profiles. Feb 14, 2024
@mingmingl-llvm mingmingl-llvm changed the title [TypeProf][InstrPGO] Introduce vtable value type in raw and instr profiles. [TypeProf][InstrPGO] Introduce raw and instr profile format change for type profiling. Feb 14, 2024
@mingmingl-llvm mingmingl-llvm marked this pull request as ready for review February 14, 2024 23:59
@llvmbot llvmbot added compiler-rt PGO Profile Guided Optimizations llvm:transforms labels Feb 14, 2024
@llvmbot
Copy link
Member

llvmbot commented Feb 14, 2024

@llvm/pr-subscribers-llvm-transforms

@llvm/pr-subscribers-pgo

Author: Mingming Liu (minglotus-6)

Changes
  • Raw profile format
    • Header: records the byte size of compressed vtable names, and the number of profiled vtable entries (call it VTableProfData). Header also records padded bytes of each section.
    • Payload: adds a section for compressed vtable names, and a section to store VTableProfData. Both sections are padded so the size is a multiple of 8.
  • Indexed profile format
    • Header: records the byte offset of compressed vtable names.
    • Payload: adds a section to store compressed vtable names. This section is used by llvm-profdata to show the list of vtables profiled for an instrumented site.

The originally reviewed patch will have profile reader/write change and llvm-profdata change.

rfc in https://discourse.llvm.org/t/rfc-dynamic-type-profiling-and-optimizations-in-llvm/74600


Patch is 69.18 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/81691.diff

31 Files Affected:

  • (modified) compiler-rt/include/profile/InstrProfData.inc (+49-3)
  • (modified) compiler-rt/lib/profile/InstrProfiling.h (+28-7)
  • (modified) compiler-rt/lib/profile/InstrProfilingBuffer.c (+65-15)
  • (modified) compiler-rt/lib/profile/InstrProfilingInternal.h (+6-2)
  • (modified) compiler-rt/lib/profile/InstrProfilingMerge.c (+21-2)
  • (modified) compiler-rt/lib/profile/InstrProfilingPlatformLinux.c (+20)
  • (modified) compiler-rt/lib/profile/InstrProfilingWriter.c (+29-8)
  • (modified) compiler-rt/test/profile/instrprof-write-buffer-internal.c (+4-2)
  • (modified) llvm/include/llvm/ProfileData/InstrProf.h (+14-3)
  • (modified) llvm/include/llvm/ProfileData/InstrProfData.inc (+32-6)
  • (modified) llvm/include/llvm/ProfileData/InstrProfReader.h (+13)
  • (modified) llvm/lib/ProfileData/InstrProf.cpp (+9-2)
  • (modified) llvm/lib/ProfileData/InstrProfReader.cpp (+38-2)
  • (modified) llvm/lib/ProfileData/InstrProfWriter.cpp (+40-5)
  • (modified) llvm/test/Instrumentation/InstrProfiling/coverage.ll (+4-4)
  • (modified) llvm/test/Transforms/PGOProfile/Inputs/thinlto_indirect_call_promotion.profraw ()
  • (modified) llvm/test/Transforms/PGOProfile/comdat_internal.ll (+2-2)
  • (modified) llvm/test/tools/llvm-profdata/Inputs/c-general.profraw ()
  • (added) llvm/test/tools/llvm-profdata/Inputs/thinlto_indirect_call_promotion.profraw ()
  • (modified) llvm/test/tools/llvm-profdata/binary-ids-padding.test (+5-1)
  • (modified) llvm/test/tools/llvm-profdata/large-binary-id-size.test (+3-1)
  • (modified) llvm/test/tools/llvm-profdata/malformed-not-space-for-another-header.test (+5-1)
  • (modified) llvm/test/tools/llvm-profdata/malformed-num-counters-zero.test (+5-1)
  • (modified) llvm/test/tools/llvm-profdata/malformed-ptr-to-counter-array.test (+5-1)
  • (modified) llvm/test/tools/llvm-profdata/misaligned-binary-ids-size.test (+3-1)
  • (modified) llvm/test/tools/llvm-profdata/mismatched-raw-profile-header.test (+2)
  • (modified) llvm/test/tools/llvm-profdata/raw-32-bits-be.test (+6-5)
  • (modified) llvm/test/tools/llvm-profdata/raw-32-bits-le.test (+5-5)
  • (modified) llvm/test/tools/llvm-profdata/raw-64-bits-be.test (+5-5)
  • (modified) llvm/test/tools/llvm-profdata/raw-64-bits-le.test (+5-5)
  • (modified) llvm/test/tools/llvm-profdata/raw-two-profiles.test (+6-2)
diff --git a/compiler-rt/include/profile/InstrProfData.inc b/compiler-rt/include/profile/InstrProfData.inc
index 25df899b3f3619..fbf17b2000e758 100644
--- a/compiler-rt/include/profile/InstrProfData.inc
+++ b/compiler-rt/include/profile/InstrProfData.inc
@@ -94,6 +94,25 @@ INSTR_PROF_DATA(const uint32_t, llvm::Type::getInt32Ty(Ctx), NumBitmapBytes, \
 #undef INSTR_PROF_DATA
 /* INSTR_PROF_DATA end. */
 
+/* For a virtual table object, record the name hash to associate profiled
+ * addresses with global variables, and record {starting address, size in bytes}
+ * to map the profiled virtual table (which usually have an offset from the
+ * starting address) back to a virtual table object. */
+#ifndef INSTR_PROF_VTABLE_DATA
+#define INSTR_PROF_VTABLE_DATA(Type, LLVMType, Name, Initializer)
+#else
+#define INSTR_PROF_VTABLE_DATA_DEFINED
+#endif
+INSTR_PROF_VTABLE_DATA(const uint64_t, llvm::Type::getInt64Ty(Ctx), \
+                       VTableNameHash, ConstantInt::get(llvm::Type::getInt64Ty(Ctx), \
+                     IndexedInstrProf::ComputeHash(PGOVTableName)))
+INSTR_PROF_VTABLE_DATA(const IntPtrT, llvm::Type::getInt8PtrTy(Ctx), \
+                       VTablePointer, VTableAddr)
+INSTR_PROF_VTABLE_DATA(const uint32_t, llvm::Type::getInt32Ty(Ctx), VTableSize, \
+                       ConstantInt::get(llvm::Type::getInt32Ty(Ctx), \
+                                        VTableSizeVal))
+#undef INSTR_PROF_VTABLE_DATA
+/* INSTR_PROF_VTABLE_DATA end. */
 
 /* This is an internal data structure used by value profiler. It
  * is defined here to allow serialization code sharing by LLVM
@@ -145,6 +164,8 @@ INSTR_PROF_RAW_HEADER(uint64_t, CountersDelta,
 INSTR_PROF_RAW_HEADER(uint64_t, BitmapDelta,
                       (uintptr_t)BitmapBegin - (uintptr_t)DataBegin)
 INSTR_PROF_RAW_HEADER(uint64_t, NamesDelta, (uintptr_t)NamesBegin)
+INSTR_PROF_RAW_HEADER(uint64_t, VNamesSize, VNamesSize)
+INSTR_PROF_RAW_HEADER(uint64_t, NumVTables, NumVTables)
 INSTR_PROF_RAW_HEADER(uint64_t, ValueKindLast, IPVK_Last)
 #undef INSTR_PROF_RAW_HEADER
 /* INSTR_PROF_RAW_HEADER  end */
@@ -186,13 +207,28 @@ VALUE_PROF_FUNC_PARAM(uint32_t, CounterIndex, Type::getInt32Ty(Ctx))
 VALUE_PROF_KIND(IPVK_IndirectCallTarget, 0, "indirect call target")
 /* For memory intrinsic functions size profiling. */
 VALUE_PROF_KIND(IPVK_MemOPSize, 1, "memory intrinsic functions size")
+/* For virtual table address profiling, the address point of the virtual table
+ * (i.e., the address contained in objects pointing to a virtual table) are
+ * profiled. Note this may not be the address of the per C++ class virtual table
+ *  object (e.g., there might be an offset).
+ *
+ * The profiled addresses are stored in raw profile, together with the following
+ * two types of information.
+ * 1. The (starting and ending) addresses of per C++ class virtual table objects.
+ * 2. The (compressed) virtual table object names.
+ * RawInstrProfReader converts profiled virtual table addresses to virtual table
+ *  objects' MD5 hash.
+ */
+VALUE_PROF_KIND(IPVK_VTableTarget, 2, "The address of the compatible vtable (i.e., "
+                                      "there is an offset from this address to per C++ "
+                                      "class virtual table global variable.)")
 /* These two kinds must be the last to be
  * declared. This is to make sure the string
  * array created with the template can be
  * indexed with the kind value.
  */
 VALUE_PROF_KIND(IPVK_First, IPVK_IndirectCallTarget, "first")
-VALUE_PROF_KIND(IPVK_Last, IPVK_MemOPSize, "last")
+VALUE_PROF_KIND(IPVK_Last, IPVK_VTableTarget, "last")
 
 #undef VALUE_PROF_KIND
 /* VALUE_PROF_KIND end */
@@ -282,12 +318,18 @@ INSTR_PROF_SECT_ENTRY(IPSK_bitmap, \
 INSTR_PROF_SECT_ENTRY(IPSK_name, \
                       INSTR_PROF_QUOTE(INSTR_PROF_NAME_COMMON), \
                       INSTR_PROF_NAME_COFF, "__DATA,")
+INSTR_PROF_SECT_ENTRY(IPSK_vname, \
+                      INSTR_PROF_QUOTE(INSTR_PROF_VNAME_COMMON), \
+                      INSTR_PROF_VNAME_COFF, "__DATA,")
 INSTR_PROF_SECT_ENTRY(IPSK_vals, \
                       INSTR_PROF_QUOTE(INSTR_PROF_VALS_COMMON), \
                       INSTR_PROF_VALS_COFF, "__DATA,")
 INSTR_PROF_SECT_ENTRY(IPSK_vnodes, \
                       INSTR_PROF_QUOTE(INSTR_PROF_VNODES_COMMON), \
                       INSTR_PROF_VNODES_COFF, "__DATA,")
+INSTR_PROF_SECT_ENTRY(IPSK_vtab, \
+                      INSTR_PROF_QUOTE(INSTR_PROF_VTAB_COMMON), \
+                      INSTR_PROF_VTAB_COFF, "__DATA,")
 INSTR_PROF_SECT_ENTRY(IPSK_covmap, \
                       INSTR_PROF_QUOTE(INSTR_PROF_COVMAP_COMMON), \
                       INSTR_PROF_COVMAP_COFF, "__LLVM_COV,")
@@ -663,9 +705,9 @@ serializeValueProfDataFrom(ValueProfRecordClosure *Closure,
         (uint64_t)'f' << 16 | (uint64_t)'R' << 8 | (uint64_t)129
 
 /* Raw profile format version (start from 1). */
-#define INSTR_PROF_RAW_VERSION 9
+#define INSTR_PROF_RAW_VERSION 10
 /* Indexed profile format version (start from 1). */
-#define INSTR_PROF_INDEX_VERSION 11
+#define INSTR_PROF_INDEX_VERSION 12
 /* Coverage mapping format version (start from 0). */
 #define INSTR_PROF_COVMAP_VERSION 6
 
@@ -703,10 +745,12 @@ serializeValueProfDataFrom(ValueProfRecordClosure *Closure,
    than WIN32 */
 #define INSTR_PROF_DATA_COMMON __llvm_prf_data
 #define INSTR_PROF_NAME_COMMON __llvm_prf_names
+#define INSTR_PROF_VNAME_COMMON __llvm_prf_vtabnames
 #define INSTR_PROF_CNTS_COMMON __llvm_prf_cnts
 #define INSTR_PROF_BITS_COMMON __llvm_prf_bits
 #define INSTR_PROF_VALS_COMMON __llvm_prf_vals
 #define INSTR_PROF_VNODES_COMMON __llvm_prf_vnds
+#define INSTR_PROF_VTAB_COMMON __llvm_prf_vtab
 #define INSTR_PROF_COVMAP_COMMON __llvm_covmap
 #define INSTR_PROF_COVFUN_COMMON __llvm_covfun
 #define INSTR_PROF_COVDATA_COMMON __llvm_covdata
@@ -717,10 +761,12 @@ serializeValueProfDataFrom(ValueProfRecordClosure *Closure,
  */
 #define INSTR_PROF_DATA_COFF ".lprfd$M"
 #define INSTR_PROF_NAME_COFF ".lprfn$M"
+#define INSTR_PROF_VNAME_COFF ".lprfvn$M"
 #define INSTR_PROF_CNTS_COFF ".lprfc$M"
 #define INSTR_PROF_BITS_COFF ".lprfb$M"
 #define INSTR_PROF_VALS_COFF ".lprfv$M"
 #define INSTR_PROF_VNODES_COFF ".lprfnd$M"
+#define INSTR_PROF_VTAB_COFF ".lprfvt$M"
 #define INSTR_PROF_COVMAP_COFF ".lcovmap$M"
 #define INSTR_PROF_COVFUN_COFF ".lcovfun$M"
 /* Since cov data and cov names sections are not allocated, we don't need to
diff --git a/compiler-rt/lib/profile/InstrProfiling.h b/compiler-rt/lib/profile/InstrProfiling.h
index 01239083369187..be694a8d3330ba 100644
--- a/compiler-rt/lib/profile/InstrProfiling.h
+++ b/compiler-rt/lib/profile/InstrProfiling.h
@@ -49,6 +49,12 @@ typedef struct ValueProfNode {
 #include "profile/InstrProfData.inc"
 } ValueProfNode;
 
+typedef void *IntPtrT;
+typedef struct COMPILER_RT_ALIGNAS(INSTR_PROF_DATA_ALIGNMENT) VTableProfData {
+#define INSTR_PROF_VTABLE_DATA(Type, LLVMType, Name, Initializer) Type Name;
+#include "profile/InstrProfData.inc"
+} VTableProfData;
+
 /*!
  * \brief Return 1 if profile counters are continuously synced to the raw
  * profile via an mmap(). This is in contrast to the default mode, in which
@@ -103,12 +109,16 @@ const __llvm_profile_data *__llvm_profile_begin_data(void);
 const __llvm_profile_data *__llvm_profile_end_data(void);
 const char *__llvm_profile_begin_names(void);
 const char *__llvm_profile_end_names(void);
+const char *__llvm_profile_begin_vtabnames(void);
+const char *__llvm_profile_end_vtabnames(void);
 char *__llvm_profile_begin_counters(void);
 char *__llvm_profile_end_counters(void);
 char *__llvm_profile_begin_bitmap(void);
 char *__llvm_profile_end_bitmap(void);
 ValueProfNode *__llvm_profile_begin_vnodes();
 ValueProfNode *__llvm_profile_end_vnodes();
+VTableProfData *__llvm_profile_begin_vtables();
+VTableProfData *__llvm_profile_end_vtables();
 uint32_t *__llvm_profile_begin_orderfile();
 
 /*!
@@ -252,20 +262,31 @@ uint64_t __llvm_profile_get_num_bitmap_bytes(const char *Begin,
 /*! \brief Get the size of the profile name section in bytes. */
 uint64_t __llvm_profile_get_name_size(const char *Begin, const char *End);
 
-/* ! \brief Given the sizes of the data and counter information, return the
- * number of padding bytes before and after the counters, and after the names,
- * in the raw profile.
+/*! \brief Get the number of virtual table profile data entries */
+uint64_t __llvm_profile_get_num_vtable(const VTableProfData *Begin,
+                                       const VTableProfData *End);
+
+/*! \brief Get the size of virtual table profile data in bytes. */
+uint64_t __llvm_profile_get_vtable_section_size(const VTableProfData *Begin,
+                                                const VTableProfData *End);
+
+/* ! \brief Given the sizes of the data and counter information, computes the
+ * number of padding bytes before and after the counter section, as well as the
+ * number of padding bytes after other setions in the raw profile.
+ * Returns -1 upon errors and 0 upon success. Output parameters should be used
+ * iff return value is 0.
  *
  * Note: When mmap() mode is disabled, no padding bytes before/after counters
  * are needed. However, in mmap() mode, the counter section in the raw profile
  * must be page-aligned: this API computes the number of padding bytes
  * needed to achieve that.
  */
-void __llvm_profile_get_padding_sizes_for_counters(
+int __llvm_profile_get_padding_sizes_for_counters(
     uint64_t DataSize, uint64_t CountersSize, uint64_t NumBitmapBytes,
-    uint64_t NamesSize, uint64_t *PaddingBytesBeforeCounters,
-    uint64_t *PaddingBytesAfterCounters, uint64_t *PaddingBytesAfterBitmap,
-    uint64_t *PaddingBytesAfterNames);
+    uint64_t NamesSize, uint64_t VTableSize, uint64_t VNameSize,
+    uint64_t *PaddingBytesBeforeCounters, uint64_t *PaddingBytesAfterCounters,
+    uint64_t *PaddingBytesAfterBitmap, uint64_t *PaddingBytesAfterNames,
+    uint64_t *PaddingBytesAfterVTable, uint64_t *PaddingBytesAfterVNames);
 
 /*!
  * \brief Set the flag that profile data has been dumped to the file.
diff --git a/compiler-rt/lib/profile/InstrProfilingBuffer.c b/compiler-rt/lib/profile/InstrProfilingBuffer.c
index af52804b2b532c..a38d0bb96f11ec 100644
--- a/compiler-rt/lib/profile/InstrProfilingBuffer.c
+++ b/compiler-rt/lib/profile/InstrProfilingBuffer.c
@@ -51,10 +51,14 @@ uint64_t __llvm_profile_get_size_for_buffer(void) {
   const char *BitmapEnd = __llvm_profile_end_bitmap();
   const char *NamesBegin = __llvm_profile_begin_names();
   const char *NamesEnd = __llvm_profile_end_names();
+  const VTableProfData *VTableBegin = __llvm_profile_begin_vtables();
+  const VTableProfData *VTableEnd = __llvm_profile_end_vtables();
+  const char *VNamesBegin = __llvm_profile_begin_vtabnames();
+  const char *VNamesEnd = __llvm_profile_end_vtabnames();
 
   return __llvm_profile_get_size_for_buffer_internal(
       DataBegin, DataEnd, CountersBegin, CountersEnd, BitmapBegin, BitmapEnd,
-      NamesBegin, NamesEnd);
+      NamesBegin, NamesEnd, VTableBegin, VTableEnd, VNamesBegin, VNamesEnd);
 }
 
 COMPILER_RT_VISIBILITY
@@ -71,6 +75,19 @@ uint64_t __llvm_profile_get_data_size(const __llvm_profile_data *Begin,
   return __llvm_profile_get_num_data(Begin, End) * sizeof(__llvm_profile_data);
 }
 
+COMPILER_RT_VISIBILITY
+uint64_t __llvm_profile_get_num_vtable(const VTableProfData *Begin,
+                                       const VTableProfData *End) {
+  intptr_t EndI = (intptr_t)End, BeginI = (intptr_t)Begin;
+  return (EndI + sizeof(VTableProfData) - 1 - BeginI) / sizeof(VTableProfData);
+}
+
+COMPILER_RT_VISIBILITY
+uint64_t __llvm_profile_get_vtable_section_size(const VTableProfData *Begin,
+                                                const VTableProfData *End) {
+  return __llvm_profile_get_num_vtable(Begin, End) * sizeof(VTableProfData);
+}
+
 COMPILER_RT_VISIBILITY size_t __llvm_profile_counter_entry_size(void) {
   if (__llvm_profile_get_version() & VARIANT_MASK_BYTE_COVERAGE)
     return sizeof(uint8_t);
@@ -119,11 +136,13 @@ static int needsCounterPadding(void) {
 }
 
 COMPILER_RT_VISIBILITY
-void __llvm_profile_get_padding_sizes_for_counters(
+int __llvm_profile_get_padding_sizes_for_counters(
     uint64_t DataSize, uint64_t CountersSize, uint64_t NumBitmapBytes,
-    uint64_t NamesSize, uint64_t *PaddingBytesBeforeCounters,
-    uint64_t *PaddingBytesAfterCounters, uint64_t *PaddingBytesAfterBitmapBytes,
-    uint64_t *PaddingBytesAfterNames) {
+    uint64_t NamesSize, uint64_t VTableSize, uint64_t VNameSize,
+    uint64_t *PaddingBytesBeforeCounters, uint64_t *PaddingBytesAfterCounters,
+    uint64_t *PaddingBytesAfterBitmapBytes, uint64_t *PaddingBytesAfterNames,
+    uint64_t *PaddingBytesAfterVTable, uint64_t *PaddingBytesAfterVName) {
+  // Counter padding is needed only if continuous mode is enabled.
   if (!needsCounterPadding()) {
     *PaddingBytesBeforeCounters = 0;
     *PaddingBytesAfterCounters =
@@ -131,9 +150,19 @@ void __llvm_profile_get_padding_sizes_for_counters(
     *PaddingBytesAfterBitmapBytes =
         __llvm_profile_get_num_padding_bytes(NumBitmapBytes);
     *PaddingBytesAfterNames = __llvm_profile_get_num_padding_bytes(NamesSize);
-    return;
+    if (PaddingBytesAfterVTable != NULL)
+      *PaddingBytesAfterVTable =
+          __llvm_profile_get_num_padding_bytes(VTableSize);
+    if (PaddingBytesAfterVName != NULL)
+      *PaddingBytesAfterVName = __llvm_profile_get_num_padding_bytes(VNameSize);
+    return 0;
   }
 
+  // Value profiling not supported in continuous mode at profile-write time.
+  // Return -1 to alert the incompatibility.
+  if (VTableSize != 0 || VNameSize != 0)
+    return -1;
+
   // In continuous mode, the file offsets for headers and for the start of
   // counter sections need to be page-aligned.
   *PaddingBytesBeforeCounters =
@@ -142,13 +171,22 @@ void __llvm_profile_get_padding_sizes_for_counters(
   *PaddingBytesAfterBitmapBytes =
       calculateBytesNeededToPageAlign(NumBitmapBytes);
   *PaddingBytesAfterNames = calculateBytesNeededToPageAlign(NamesSize);
+  // Set these two variables to zero to avoid uninitialized variables
+  // even if VTableSize and VNameSize are known to be zero.
+  if (PaddingBytesAfterVTable != NULL)
+    *PaddingBytesAfterVTable = 0;
+  if (PaddingBytesAfterVName != NULL)
+    *PaddingBytesAfterVName = 0;
+  return 0;
 }
 
 COMPILER_RT_VISIBILITY
 uint64_t __llvm_profile_get_size_for_buffer_internal(
     const __llvm_profile_data *DataBegin, const __llvm_profile_data *DataEnd,
     const char *CountersBegin, const char *CountersEnd, const char *BitmapBegin,
-    const char *BitmapEnd, const char *NamesBegin, const char *NamesEnd) {
+    const char *BitmapEnd, const char *NamesBegin, const char *NamesEnd,
+    const VTableProfData *VTableBegin, const VTableProfData *VTableEnd,
+    const char *VNamesBegin, const char *VNamesEnd) {
   /* Match logic in __llvm_profile_write_buffer(). */
   const uint64_t NamesSize = (NamesEnd - NamesBegin) * sizeof(char);
   uint64_t DataSize = __llvm_profile_get_data_size(DataBegin, DataEnd);
@@ -156,20 +194,29 @@ uint64_t __llvm_profile_get_size_for_buffer_internal(
       __llvm_profile_get_counters_size(CountersBegin, CountersEnd);
   const uint64_t NumBitmapBytes =
       __llvm_profile_get_num_bitmap_bytes(BitmapBegin, BitmapEnd);
+  const uint64_t VTableSize =
+      __llvm_profile_get_vtable_section_size(VTableBegin, VTableEnd);
+  const uint64_t VNameSize =
+      __llvm_profile_get_name_size(VNamesBegin, VNamesEnd);
 
   /* Determine how much padding is needed before/after the counters and after
    * the names. */
   uint64_t PaddingBytesBeforeCounters, PaddingBytesAfterCounters,
-      PaddingBytesAfterNames, PaddingBytesAfterBitmapBytes;
+      PaddingBytesAfterNames, PaddingBytesAfterBitmapBytes,
+      PaddingBytesAfterVTable, PaddingBytesAfterVNames;
   __llvm_profile_get_padding_sizes_for_counters(
-      DataSize, CountersSize, NumBitmapBytes, NamesSize,
-      &PaddingBytesBeforeCounters, &PaddingBytesAfterCounters,
-      &PaddingBytesAfterBitmapBytes, &PaddingBytesAfterNames);
+      DataSize, CountersSize, NumBitmapBytes, NamesSize, 0 /* VTableSize */,
+      0 /* VNameSize */, &PaddingBytesBeforeCounters,
+      &PaddingBytesAfterCounters, &PaddingBytesAfterBitmapBytes,
+      &PaddingBytesAfterNames, &PaddingBytesAfterVTable,
+      &PaddingBytesAfterVNames);
 
   return sizeof(__llvm_profile_header) + __llvm_write_binary_ids(NULL) +
          DataSize + PaddingBytesBeforeCounters + CountersSize +
          PaddingBytesAfterCounters + NumBitmapBytes +
-         PaddingBytesAfterBitmapBytes + NamesSize + PaddingBytesAfterNames;
+         PaddingBytesAfterBitmapBytes + NamesSize + PaddingBytesAfterNames +
+         VTableSize + PaddingBytesAfterVTable + VNameSize +
+         PaddingBytesAfterVNames;
 }
 
 COMPILER_RT_VISIBILITY
@@ -191,7 +238,10 @@ COMPILER_RT_VISIBILITY int __llvm_profile_write_buffer_internal(
     const char *NamesBegin, const char *NamesEnd) {
   ProfDataWriter BufferWriter;
   initBufferWriter(&BufferWriter, Buffer);
-  return lprofWriteDataImpl(&BufferWriter, DataBegin, DataEnd, CountersBegin,
-                            CountersEnd, BitmapBegin, BitmapEnd, 0, NamesBegin,
-                            NamesEnd, 0);
+  // Set virtual table arguments to NULL since they are not supported yet.
+  return lprofWriteDataImpl(
+      &BufferWriter, DataBegin, DataEnd, CountersBegin, CountersEnd,
+      BitmapBegin, BitmapEnd, 0 /* VPDataReader */, NamesBegin, NamesEnd,
+      NULL /* VTableBegin */, NULL /* VTableEnd */, NULL /* VNamesBegin */,
+      NULL /* VNamesEnd */, 0 /* SkipNameDataWrite */);
 }
diff --git a/compiler-rt/lib/profile/InstrProfilingInternal.h b/compiler-rt/lib/profile/InstrProfilingInternal.h
index 03ed67fcfa766f..d5bd0e41fb1291 100644
--- a/compiler-rt/lib/profile/InstrProfilingInternal.h
+++ b/compiler-rt/lib/profile/InstrProfilingInternal.h
@@ -22,7 +22,9 @@
 uint64_t __llvm_profile_get_size_for_buffer_internal(
     const __llvm_profile_data *DataBegin, const __llvm_profile_data *DataEnd,
     const char *CountersBegin, const char *CountersEnd, const char *BitmapBegin,
-    const char *BitmapEnd, const char *NamesBegin, const char *NamesEnd);
+    const char *BitmapEnd, const char *NamesBegin, const char *NamesEnd,
+    const VTableProfData *VTableBegin, const VTableProfData *VTableEnd,
+    const char *VNamesBegin, const char *VNamesEnd);
 
 /*!
  * \brief Write instrumentation data to the given buffer, given explicit
@@ -156,7 +158,9 @@ int lprofWriteDataImpl(ProfDataWriter *Writer,
                        const char *CountersBegin, const char *CountersEnd,
                        const char *BitmapBegin, const char *BitmapEnd,
                        VPDataReaderType *VPDataReader, const char *NamesBegin,
-                       const char *NamesEnd, int SkipNameDataWrite);
+                       const char *NamesEnd, const VTableProfData *VTableBegin,
+                       const VTableProfData *VTableEnd, const char *VNamesBegin,
+                       const char *VNamesEnd, int SkipNameDataWrite);
 
 /* Merge value profile data pointed to by SrcValueProfData into
  * in-memory profile counters pointed by to DstData.  */
diff --git a/compiler-rt/lib/profile/InstrProfilingMerge.c b/compiler-rt/lib/profile/InstrProfilingMerge.c
index b5850e99ee37d8..c6e03db7009a8c 100644
--- a/compiler-rt/lib/profile/InstrProfilingMerge.c
+++ b/compiler-rt/lib/profile/InstrProfilingMerge.c
@@ -107,6 +107,26 @@ static uintptr_t signextIfWin64(void *V) {
 #endif
 }
 
+static uint64_t
+getDistanceFromCounterToValueProf(const __llvm_profile_header *const Header) {
+  // Skip names section, vtable profile data section and vtable names section
+  // for runtime profile merge. To merge runtime addresses from multiple
+  // profiles collected from the same instrumented binary, the binary should be
+  // loaded at fixed base address (e.g., build with -no-pie, or run with ASLR
+  // disabled). In this set-up these three sections remain unchanged.
+  const uint64_t VTableSectionSize =
+      Header->NumVTables * sizeof(VTableProfData);
+  const uint64_t PaddingBytesAfterVTableSection =
+      __llvm_profile_get_num_padding_bytes(VTableSectionSize);
+  const uint64_t VNamesSize = Header->VNamesSize;
+  const uint64_t PaddingBytesAfterVNamesSize =
+      __llvm_profile_get_num_padding_bytes(VNamesSize);
+  return Header->Nam...
[truncated]

@mingmingl-llvm
Copy link
Contributor Author

I have to ask for another round of review , after the original PR (#66825) received many many feedbacks. Thanks all for the time again!

#endif
INSTR_PROF_VTABLE_DATA(const uint64_t, llvm::Type::getInt64Ty(Ctx), \
VTableNameHash, ConstantInt::get(llvm::Type::getInt64Ty(Ctx), \
IndexedInstrProf::ComputeHash(PGOVTableName)))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

indentation

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed.

@@ -107,6 +107,26 @@ static uintptr_t signextIfWin64(void *V) {
#endif
}

static uint64_t
getDistanceFromCounterToValueProf(const __llvm_profile_header *const Header) {
// Skip names section, vtable profile data section and vtable names section
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Move comments outside function body.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.

for (int I = 0; I < N - 4; I++)
// 'BinaryIdOffset', `TemporalProfTracesOffset` and `VTableNamesOffset`. We
// need to remember the offset of these fields to allow back patching later.
for (int I = 0; I < N - 5; I++)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

define a macro "#define SKIPPED_FIELDS 5"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed this to write out the first four fields and updated comment.

@mingmingl-llvm
Copy link
Contributor Author

thanks for the review! I'm now communicating with the stakeholders for coordination. Once there is a plan (hopefully as late as mid next week), I will merge this change.

@modiking
Copy link
Contributor

Also the CI clang-format is complaining so make sure the re-run that on your changes

@modiking
Copy link
Contributor

Also the CI clang-format is complaining so make sure the re-run that on your changes

clang-format formats existing code (specifically C++ macros) in InstrProfData.inc pretty aggressively and the indentations making code hard to read.

Taking https://github.com/llvm/llvm-project/actions/runs/7921397694/job/21626608711?pr=81691 as an example, reformatted code between -/+ line 210 to line 267 are hard to read, same for code between line 284 to line 404, code between 780 to 970, etc

Given clang-format only complains about two InstrProfData.inc, I'm inclined not to run git clang-format to update these two files.

If that's the case we could ignore these files from clang-format via a .clang-format-ignore file if you're inclined to make that change.

Copy link
Contributor

@modiking modiking left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@mingmingl-llvm
Copy link
Contributor Author

Also the CI clang-format is complaining so make sure the re-run that on your changes

clang-format formats existing code (specifically C++ macros) in InstrProfData.inc pretty aggressively and the indentations making code hard to read.
Taking https://github.com/llvm/llvm-project/actions/runs/7921397694/job/21626608711?pr=81691 as an example, reformatted code between -/+ line 210 to line 267 are hard to read, same for code between line 284 to line 404, code between 780 to 970, etc
Given clang-format only complains about two InstrProfData.inc, I'm inclined not to run git clang-format to update these two files.

If that's the case we could ignore these files from clang-format via a .clang-format-ignore file if you're inclined to make that change.

Thanks for the suggestions! After taking a look, .clang-format-ignore isn't used in current repo while // clang-format off:<reason> is used (e.g. at 7 different places in this file). I could make possibly add a few // clang-format off and // clang-format on around c++ macros in an nfc pull request.

Meanwhile, I'll send out a follow up change to update https://llvm.org/docs/InstrProfileFormat.html

INSTR_PROF_VTABLE_DATA(const uint64_t, llvm::Type::getInt64Ty(Ctx), \
VTableNameHash, ConstantInt::get(llvm::Type::getInt64Ty(Ctx), \
IndexedInstrProf::ComputeHash(PGOVTableName)))
INSTR_PROF_VTABLE_DATA(const IntPtrT, llvm::Type::getInt8PtrTy(Ctx), \
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Diffing the contents here with the file in llvm/ProfileData/ showed me that the latter uses llvm::PointerType::getUnqual(Ctx). Can you sync them up and make the diff the files to ensure there are no inconsistencies?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good catch. Fixed all the inconsistencies between two 'InstrProfData.inc`files.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also sent out an nfc to clean up the use of getInt8PtrTy

uint64_t __llvm_profile_get_num_vtable(const VTableProfData *Begin,
const VTableProfData *End) {
intptr_t EndI = (intptr_t)End, BeginI = (intptr_t)Begin;
return (EndI + sizeof(VTableProfData) - 1 - BeginI) / sizeof(VTableProfData);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't follow the math for the numerator. Can you add a comment to explain?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Originally I implemented __llvm_profile_get_num_vtable based on __llvm_profile_get_num_data, and the question is around whether/why sizeof(C struct) - 1 is necessary in the numerator.

From offline chat, the conclusion (based on reverse reasoning) is that sizeof(C struct) - 1 is necessary iff [Begin, End] is an inclusive range.

  • For example, the function is expected to return 1 but would return 0 if caller passes <Addr, Addr + sizeof(C-struct) -1> as the pair of <Begin, End>

From inspecting the elf sections of __llvm_prof_cnts in a simple instrumented hello-world program and looking at pointer values in a debugger, End is one byte past the inclusive range (i.e. caller passes [Begin, End) as function argument).

So it makes no difference (at least for elf) if we remove sizeof(C struct) - 1 from the numerator. I added a comment in __llvm_profile_get_num_data to record the conclusion (but keep existing code as it is) and removed sizeof()-1 in the new code.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or rather, I wonder if I shall make __llvm_profile_get_num_vtable and __llvm_profile_get_num_data consistent (both should use sizeof(c struct) - 1 or both should not use them).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think new code should not include it just because it was there before. If you update the old code and it breaks something post-commit then reverting this patch will be painful. If you want to keep things consistent, I would split out the change to the old code as a small patch and submit it soon (and monitor for any breakage - though I believe it is unlikely to happen).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sounds good. I'll send out separate patches to update existing code (__llvm_profile_get_num_data). Before that I might dig and validate whether non-elf platforms also guarantee caller passes [Begin, End) (not [Begin, End]).

COMPILER_RT_VISIBILITY
uint64_t __llvm_profile_get_vtable_section_size(const VTableProfData *Begin,
const VTableProfData *End) {
return __llvm_profile_get_num_vtable(Begin, End) * sizeof(VTableProfData);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this just be return End - Begin; now?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I updated both __llvm_profile_get_num_vtable and __llvm_profile_get_vtable_section_size to get rid of sizeof(C-struct).

https://gcc.godbolt.org/z/MWaG6cEPn shows the generated codes before and after are equivalent (as long as ((intptr_t)End-(intptr_t)Begin) % sizeof(C-struct)==0 is true)

uint64_t __llvm_profile_get_num_vtable(const VTableProfData *Begin,
const VTableProfData *End) {
intptr_t EndI = (intptr_t)End, BeginI = (intptr_t)Begin;
return (EndI + sizeof(VTableProfData) - 1 - BeginI) / sizeof(VTableProfData);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think new code should not include it just because it was there before. If you update the old code and it breaks something post-commit then reverting this patch will be painful. If you want to keep things consistent, I would split out the change to the old code as a small patch and submit it soon (and monitor for any breakage - though I believe it is unlikely to happen).

mingmingl-llvm added a commit that referenced this pull request Feb 21, 2024
…matting on the rest (#82057)

Without this, each time `InstrProfData.inc` is modified (like in
#81691), pre-commit CI
clang-format aggressively formats many lines in an unreadable way. Pull
request with red pre-commit checks are usually frowned upon.

* Use `// clang-format:<reason>` instead of `/* clang-format */`. The
former
[allows](https://github.com/llvm/llvm-project/blob/563ef306017a47d387f1c36dd562b172c1ad0626/clang/lib/Format/Format.cpp#L4108-L4113)
specifying a reason but the latter is
[not](https://github.com/llvm/llvm-project/blob/563ef306017a47d387f1c36dd562b172c1ad0626/clang/lib/Format/Format.cpp#L4105-L4106).
- Filed #82426 to track the
issue in clang-format.
Copy link
Contributor

@snehasish snehasish left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

profile version is outdated.
- update llvm/test/tools/llvm-profdata/Inputs/compressed.profraw along with raw profile version bump
- The test requires '!zlib', so I generate raw profile with DLLVM_ENABLE_ZLIB=1 and run the
  test with DLLVM_ENABLE_ZLIB=0.
@mingmingl-llvm
Copy link
Contributor Author

Going to run another 'git merge main' as a way to re-run pre-commit CI, given the current Windows failure doesn't seem relevant.

@mingmingl-llvm mingmingl-llvm merged commit db7e9e6 into main Feb 22, 2024
4 checks passed
@mingmingl-llvm mingmingl-llvm deleted the users/minglotus-6/typeprofrawformat branch February 22, 2024 04:59
mingmingl-llvm added a commit that referenced this pull request Feb 22, 2024
* `N` became unused after [pull request 81691](#81691)
* This should fix the build bot failure of `unused variable`
https://lab.llvm.org/buildbot/#/builders/77/builds/34840
@mingmingl-llvm mingmingl-llvm restored the users/minglotus-6/typeprofrawformat branch February 22, 2024 05:33
mingmingl-llvm added a commit that referenced this pull request Feb 22, 2024
@mingmingl-llvm
Copy link
Contributor Author

mingmingl-llvm commented Feb 22, 2024

Hi,
I reverted the change yesterday as the build of profiling runtime broke for non-ELF cases. I restored the original branch with all commit histories and appended one commit on top of that. Now waiting for Windows machine availability for a real test (MacOS test passed). Will send out a PR for review to reland the change after testing the change on Windows.

@mingmingl-llvm
Copy link
Contributor Author

Hi, I reverted the change yesterday as the build of profiling runtime broke for non-ELF cases. I restored the original branch with all commit histories and appended one commit on top of that. Now waiting for Windows machine availability for a real test (MacOS test passed). Will send out a PR for review to reland the change after testing the change on Windows.

Actually I realized only Windows linker magic needs real machine for testing. To pass the build, the functions could return NULL. Even on ELF, those pointers are NULL (i.e., no lowering of data in the profile format change).

So sent out #82711

mingmingl-llvm added a commit that referenced this pull request Feb 27, 2024
…hange for type profiling." (#82711)

New change on top of [reviewed
patch](#81691) are [in commits
after this
one](d0757f4).
Previous commits are restored from the remote branch with timestamps.

1. Fix build breakage for non-ELF platforms, by defining the missing
functions {`__llvm_profile_begin_vtables`, `__llvm_profile_end_vtables`,
`__llvm_profile_begin_vtabnames `, `__llvm_profile_end_vtabnames`}
everywhere.
* Tested on mac laptop (for darwins) and Windows. Specifically,
functions in `InstrProfilingPlatformWindows.c` returns `NULL` to make it
more explicit that type prof isn't supported; see comments for the
reason.
* For the rest (AIX, other), mostly follow existing examples (like this
[one](f95b2f1))
   
2. Rename `__llvm_prf_vtabnames` -> `__llvm_prf_vns` for shorter section
name, and make returned pointers
[const](a825d2a#diff-4de780ce726d76b7abc9d3353aef95013e7b21e7bda01be8940cc6574fb0b5ffR120-R121)

**Original Description**

* Raw profile format
- Header: records the byte size of compressed vtable names, and the
number of profiled vtable entries (call it `VTableProfData`). Header
also records padded bytes of each section.
- Payload: adds a section for compressed vtable names, and a section to
store `VTableProfData`. Both sections are padded so the size is a
multiple of 8.
* Indexed profile format
  - Header: records the byte offset of compressed vtable names.
- Payload: adds a section to store compressed vtable names. This section
is used by `llvm-profdata` to show the list of vtables profiled for an
instrumented site.
  
[The originally reviewed
patch](#66825) will have
profile reader/write change and llvm-profdata change.
- To ensure this PR has all the necessary profile format change along
with profile version bump, created a copy of the originally reviewed
patch in #80761. The copy
doesn't have profile format change, but it has the set of tests which
covers type profile generation, profile read and profile merge. Tests
pass there.
  
rfc in
https://discourse.llvm.org/t/rfc-dynamic-type-profiling-and-optimizations-in-llvm/74600

---------

Co-authored-by: modiking <[email protected]>
mingmingl-llvm added a commit that referenced this pull request Apr 1, 2024
…d/write (#66825)

(The profile format change is split into a standalone change into #81691)

* For InstrFDO value profiling, implement instrumentation and lowering for virtual table address.
* This is controlled by `-enable-vtable-value-profiling` and off by default.
* When the option is on, raw profiles will carry serialized `VTableProfData` structs and compressed vtables as payloads.
 
* Implement profile reader and writer support 
  * Raw profile reader is used by `llvm-profdata` but not compiler. Raw profile reader will construct InstrProfSymtab with symbol names, and map profiled runtime address to vtable symbols.
  * Indexed profile reader is used by `llvm-profdata` and compiler. When initialized, the reader stores a pointer to the beginning of in-memory compressed vtable names and the length of string. When used in `llvm-profdata`, reader decompress the string to show symbols of a profiled site. When used in compiler, string decompression doesn't
happen since IR is used to construct InstrProfSymtab.
  * Indexed profile writer collects the list of vtable names, and stores that to index profiles.
  * Text profile reader and writer support are added but mostly follow the implementation for indirect-call value type.
* `llvm-profdata show -show-vtables <args> <profile>` is implemented.

rfc in
https://discourse.llvm.org/t/rfc-dynamic-type-profiling-and-optimizations-in-llvm/74600#pick-instrumentation-points-and-instrument-runtime-types-7
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
compiler-rt llvm:transforms PGO Profile Guided Optimizations
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants