HADOOP-18752. Change fs.s3a.directory.marker.retention to "keep" -everything but the switch

This change has all of PR #5689 *except* for changing the
default value of marker retention from delete to keep.

1. Leaves the default value of fs.s3a.directory.marker.retention
   at "delete".
2. No longer logs an INFO message when an S3A FS instance is
   instantiated with any option other than delete.
3. Updates the directory marker documentation.

Switching to marker retention improves performance
on any S3 bucket as there are no needless marker DELETE requests,
which reduces write IOPS and removes any delay waiting
for the DELETE call to finish.

There are *very* significant improvements on versioned buckets,
where tombstone markers slow down LIST operations: the more
tombstones there are, the worse query planning gets.

Having versioning enabled on production stores is the foundation
of any data protection strategy, so this has tangible benefits
in production.

Marker retention is *not* compatible with older Hadoop releases;
specifically:
- Hadoop branch 2 < 2.10.2
- Any release of Hadoop 3.0.x and Hadoop 3.1.x
- Hadoop 3.2.0 and 3.2.1
- Hadoop 3.3.0
Incompatible releases have no problems reading data in stores
where markers are retained, but can get confused when deleting
or renaming directories.
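
The release list above can be turned into a simple version check. The
following is a minimal sketch only: MarkerCompatibility and isMarkerAware
are hypothetical names, not part of Hadoop, and the handling of versions
outside the listed ranges (anything before 2.x, anything after 3.3.x) is
an assumption rather than something this commit states.

```java
/** Hypothetical helper, not part of Hadoop: encodes the release list above. */
final class MarkerCompatibility {

  private MarkerCompatibility() {
  }

  /**
   * Returns true if the version is not one of the releases listed as
   * incompatible with retained directory markers.
   * Assumes a "major.minor.patch" string, optionally followed by a
   * suffix such as "-SNAPSHOT".
   */
  static boolean isMarkerAware(String version) {
    String core = version.split("-", 2)[0];
    String[] parts = core.split("\\.");
    if (parts.length < 3) {
      throw new IllegalArgumentException("Expected major.minor.patch: " + version);
    }
    int major = Integer.parseInt(parts[0]);
    int minor = Integer.parseInt(parts[1]);
    int patch = Integer.parseInt(parts[2]);

    if (major < 2) {
      return false; // assumption: releases before branch 2 are not covered by the list
    }
    if (major == 2) {
      // Hadoop branch 2 < 2.10.2 is incompatible
      return minor > 10 || (minor == 10 && patch >= 2);
    }
    if (major == 3) {
      switch (minor) {
        case 0:
        case 1:
          return false;      // any 3.0.x or 3.1.x release
        case 2:
          return patch >= 2; // 3.2.0 and 3.2.1 are incompatible
        case 3:
          return patch >= 1; // 3.3.0 is incompatible
        default:
          return true;       // assumption: later 3.x lines are marker-aware
      }
    }
    return true; // assumption: later major versions are marker-aware
  }

  public static void main(String[] args) {
    for (String v : new String[] {"2.10.1", "2.10.2", "3.2.1", "3.3.0", "3.3.1"}) {
      System.out.println(v + " marker-aware: " + isMarkerAware(v));
    }
  }
}
```

A check like this could gate a decision to set the option to "keep" on a
bucket that other clients also write to; it is no substitute for checking
the release notes of the versions actually deployed.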

Contributed by Steve Loughran

Change-Id: Ic9a05357a4b1b1ff6dfecf8b0f30e1eeedb2fe75
steveloughran committed Jul 19, 2023
1 parent 236b9aa commit 3be8800
Showing 6 changed files with 139 additions and 157 deletions.
@@ -186,11 +186,11 @@ public static DirectoryPolicy getDirectoryPolicy(
       policy = DELETE;
       break;
     case DIRECTORY_MARKER_POLICY_KEEP:
-      LOG.info("Directory markers will be kept");
+      LOG.debug("Directory markers will be kept");
       policy = KEEP;
       break;
     case DIRECTORY_MARKER_POLICY_AUTHORITATIVE:
-      LOG.info("Directory markers will be kept on authoritative"
+      LOG.debug("Directory markers will be kept on authoritative"
           + " paths");
       policy = new DirectoryPolicyImpl(MarkerPolicy.Authoritative,
           authoritativeness);
@@ -792,7 +792,7 @@ Security
 Delegation token support is disabled
 Directory Markers
-The directory marker policy is "delete"
+The directory marker policy is "keep"
 Available Policies: delete, keep, authoritative
 Authoritative paths: fs.s3a.authoritative.path=```
 ```