[Enhancement] Some enhancements of data file vacuuming for cloud native partition #50342

@tracymacding tracymacding commented Aug 27, 2024

Why I'm doing:

To improve data file vacuuming for cloud-native tables, this PR makes the following enhancements:

  1. Add storage size property for cloud native partition
  2. Start vacuum according to partition data size and storage size on S3
  3. Add more vacuuming metrics on FE

What I'm doing:

Fixes #issue

What type of PR is this:

  • BugFix
  • Feature
  • Enhancement
  • Refactor
  • UT
  • Doc
  • Tool

Does this PR entail a change in behavior?

  • Yes, this PR will result in a change in behavior.
  • No, this PR will not result in a change in behavior.

If yes, please specify the type of change:

  • Interface/UI changes: syntax, type conversion, expression evaluation, display information
  • Parameter changes: default values, similar parameters but with different default values
  • Policy changes: use new policy to replace old one, functionality automatically enabled
  • Feature removed
  • Miscellaneous: upgrade & downgrade compatibility, etc.

Checklist:

  • I have added test cases for my bug fix or my new feature
  • This pr needs user documentation (for new or modified features or behaviors)
    • I have added documentation for my new feature or new function
  • This is a backport pr

Bugfix cherry-pick branch check:

  • I have checked the version labels which the pr will be auto-backported to the target branch
    • 3.3
    • 3.2
    • 3.1
    • 3.0
    • 2.5

@@ -387,4 +398,4 @@ private void waitResponse() {
}
}
}
}
} No newline at end of file

The most risky bug in this code is:
Incorrect condition for removing tablets

You can modify the code like this:

snapshot.tablets.removeIf(t -> ((LakeTablet) t).getDataSizeUpdateTime() < visibleVersionTime &&
                                ((LakeTablet) t).getDataSizeUpdateTime() <= lastSuccVacuumTime);
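
To make the effect of the suggested predicate concrete, here is a minimal, self-contained sketch. `TabletStub`, `applySuggestedFilter`, and the timestamps are hypothetical stand-ins for `LakeTablet` and the snapshot in the PR. Only the `removeIf` call and its two comparisons are taken from the suggestion above.

```java
import java.util.ArrayList;
import java.util.List;

class TabletFilterSketch {
    // Hypothetical stand-in for LakeTablet: only the data-size update time matters here.
    static final class TabletStub {
        final long id;
        final long dataSizeUpdateTime;

        TabletStub(long id, long dataSizeUpdateTime) {
            this.id = id;
            this.dataSizeUpdateTime = dataSizeUpdateTime;
        }
    }

    // Returns a copy of the list without the tablets whose update time is both
    // strictly before visibleVersionTime and at or before lastSuccVacuumTime.
    static List<TabletStub> applySuggestedFilter(List<TabletStub> tablets,
                                                 long visibleVersionTime,
                                                 long lastSuccVacuumTime) {
        List<TabletStub> kept = new ArrayList<>(tablets);
        kept.removeIf(t -> t.dataSizeUpdateTime < visibleVersionTime
                && t.dataSizeUpdateTime <= lastSuccVacuumTime);
        return kept;
    }

    public static void main(String[] args) {
        List<TabletStub> tablets = List.of(
                new TabletStub(1, 10L), new TabletStub(2, 20L), new TabletStub(3, 30L));
        // With visibleVersionTime = 25 and lastSuccVacuumTime = 15, only tablet 1 matches both conditions.
        for (TabletStub t : applySuggestedFilter(tablets, 25L, 15L)) {
            System.out.println("kept tablet " + t.id);
        }
    }
}
```

Because the two comparisons are joined with `&&`, a tablet is removed only when both hold. Tablet 2 in the example (update time 20) is before the visible version time but after the last successful vacuum, so it stays in the list.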

@@ -508,7 +547,7 @@ public String toString() {
buffer.append("versionEpoch: ").append(versionEpoch).append("; ");
buffer.append("versionTxnType: ").append(versionTxnType).append("; ");

-buffer.append("storageDataSize: ").append(storageDataSize()).append("; ");
+buffer.append("storageDataSize: ").append(getDataSize()).append("; ");
buffer.append("storageRowCount: ").append(storageRowCount()).append("; ");
buffer.append("storageReplicaCount: ").append(storageReplicaCount()).append("; ");

The most risky bug in this code is:
Incorrect variable name used in the setLastSuccVacuumTime method.

You can modify the code like this:

@@ -176,13 +177,42 @@ public boolean isImmutable() {
     }

     @Override
-    public long getLastVacuumTime() {
-        return lastVacuumTime;
+    public long getLastSuccVacuumTime() {
+        return lastSuccVacuumTime;
     }

     @Override
-    public void setLastVacuumTime(long lastVacuumTime) {
-        this.lastVacuumTime = lastVacuumTime;
+    public void setLastSuccVacuumTime(long lastSuccVacuumTime) {
+        this.lastSuccVacuumTime = lastSuccVacuumTime;
+    }

return true;
}

return false;
}

public long getMinRetainVersion() {

The most risky bug in this code is:
Potential null pointer dereference when logging in the shouldVacuum method if name is not initialized.

You can modify the code like this:

if ((storageSize - dataSize >= 50L * 1024 * 1024) && (magnification > 0.1)) {
    LOG.debug("Partition: {}, storage size: {}, data size: {}, magnification: {} should vacuum now",
              getName(), getStorageSize(), getDataSize(), magnification);
    return true;
}
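
The threshold check in the snippet above can be tried in isolation with the sketch below. The page does not show how `magnification` is computed, so the formula here (excess bytes divided by data size) is an assumption made only for illustration, as are the class and constant names. The 50 MiB and 0.1 thresholds are the ones from the suggested condition.

```java
class ShouldVacuumSketch {
    // Thresholds copied from the suggested condition.
    static final long MIN_EXCESS_BYTES = 50L * 1024 * 1024;
    static final double MIN_MAGNIFICATION = 0.1;

    // ASSUMPTION: magnification = (storageSize - dataSize) / dataSize.
    // The real definition in the PR is not visible on this page.
    static double magnification(long storageSize, long dataSize) {
        long excess = storageSize - dataSize;
        if (dataSize <= 0) {
            // Avoid dividing by zero: any positive excess over empty data counts as unbounded.
            return excess > 0 ? Double.POSITIVE_INFINITY : 0.0;
        }
        return (double) excess / dataSize;
    }

    // True only when the excess is at least 50 MiB and the magnification exceeds 0.1.
    static boolean shouldVacuum(long storageSize, long dataSize) {
        return (storageSize - dataSize >= MIN_EXCESS_BYTES)
                && (magnification(storageSize, dataSize) > MIN_MAGNIFICATION);
    }

    public static void main(String[] args) {
        long mib = 1024L * 1024;
        System.out.println(shouldVacuum(200 * mib, 100 * mib));   // large excess, high ratio
        System.out.println(shouldVacuum(140 * mib, 100 * mib));   // excess below 50 MiB
        System.out.println(shouldVacuum(1060 * mib, 1000 * mib)); // excess above 50 MiB, low ratio
    }
}
```

Requiring both an absolute and a relative threshold means small partitions with little reclaimable space and large partitions with proportionally little garbage are both skipped, under the assumed definition of `magnification`.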

@github-actions github-actions bot added the 3.3 label Aug 27, 2024
@tracymacding tracymacding force-pushed the add_storage_size_for_cloud_partition branch from 39546db to a8a39ce Compare August 28, 2024 06:27
@tracymacding tracymacding force-pushed the add_storage_size_for_cloud_partition branch 3 times, most recently from c3338af to 894eb32 Compare August 29, 2024 12:47

sonarcloud bot commented Aug 29, 2024

Quality Gate failed

Failed conditions
16.4% Duplication on New Code (required ≤ 3%)

See analysis details on SonarCloud

[FE Incremental Coverage Report]

pass : 76 / 95 (80.00%)

file detail

path covered_line new_line coverage not_covered_line_detail
🔵 com/starrocks/catalog/LocalTablet.java 0 1 00.00% [510]
🔵 com/starrocks/lake/vacuum/AutovacuumDaemon.java 1 10 10.00% [205, 206, 207, 210, 211, 212, 215, 217, 218]
🔵 com/starrocks/service/InformationSchemaDataSource.java 1 3 33.33% [377, 378]
🔵 com/starrocks/catalog/Partition.java 23 27 85.19% [490, 491, 492, 666]
🔵 com/starrocks/catalog/TabletStatMgr.java 8 9 88.89% [306]
🔵 com/starrocks/catalog/PhysicalPartitionImpl.java 23 25 92.00% [200, 216]
🔵 com/starrocks/catalog/MaterializedIndex.java 5 5 100.00% []
🔵 com/starrocks/common/Config.java 2 2 100.00% []
🔵 com/starrocks/catalog/system/information/PartitionsMetaSystemTable.java 1 1 100.00% []
🔵 com/starrocks/common/proc/PartitionsProcDir.java 4 4 100.00% []
🔵 com/starrocks/metric/MetricRepo.java 4 4 100.00% []
🔵 com/starrocks/lake/LakeTablet.java 4 4 100.00% []

…ve partition

              1. Add storage size property for cloud native partition
              2. Start vacuum according to partition data size and storage size on S3
              3. Add more vacuuming metrics on FE

Signed-off-by: tracymacding <[email protected]>
@tracymacding tracymacding force-pushed the add_storage_size_for_cloud_partition branch from 894eb32 to d40d5c9 Compare December 13, 2024 08:57

sonarcloud bot commented Dec 13, 2024

[Java-Extensions Incremental Coverage Report]

pass : 0 / 0 (0%)

[BE Incremental Coverage Report]

pass : 49 / 61 (80.33%)

file detail

path covered_line new_line coverage not_covered_line_detail
🔵 src/storage/lake/vacuum.cpp 4 6 66.67% [284, 286]
🔵 src/storage/lake/tablet_manager.cpp 42 52 80.77% [890, 891, 892, 895, 925, 926, 943, 952, 953, 954]
🔵 src/service/service_be/lake_service.cpp 3 3 100.00% []
