
HADOOP-16746 mkdirs and s3guard auth mode #1810

Closed

Conversation

steveloughran
Contributor

@steveloughran steveloughran commented Jan 16, 2020

This fixes two problems with auth directory flags

  1. mkdirs was creating dir markers without the auth bit, forcing needless scans.
  2. listStatus(path) would reset the auth status bit of all child directories.

Issue 2 is possibly the more expensive of the two, as any treewalk using listStatus (e.g. globfiles) would clear the auth bit for all child directories before listing them. And this would happen every single time... Essentially you weren't getting authoritative directory listings.
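To make issue 2 concrete, here is a minimal sketch of the difference between rebuilding a child entry from the S3 listing and carrying the stored flag forward. The types and method names are simplified stand-ins, not the real DDBPathMetadata / MetadataStore API:

```java
// Simplified stand-in types, not the actual S3Guard classes.
final class AuthFlagSketch {

  static final class DirEntry {
    final String path;
    final boolean isAuthoritativeDir;

    DirEntry(String path, boolean isAuthoritativeDir) {
      this.path = path;
      this.isAuthoritativeDir = isAuthoritativeDir;
    }
  }

  /** What the old listStatus path effectively did: rebuild each child
   *  entry from the S3 listing, so a stored auth bit reverts to false. */
  static DirEntry rebuildFromS3(String childPath) {
    return new DirEntry(childPath, false);    // auth bit lost on every listing
  }

  /** What the fix needs: when an entry is rewritten at all, carry the
   *  previously stored flag forward rather than overwriting it. */
  static DirEntry mergeWithStored(String childPath, DirEntry stored) {
    boolean auth = stored != null && stored.isAuthoritativeDir;
    return new DirEntry(childPath, auth);     // auth bit preserved
  }
}
```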

Not yet finished:

* new BulkOperationState type for mkdirs, for logging
* no longer (accidentally) removing the auth flag passed in.
* ignored test now works
* other tests fail because they need a way to mark a dir as non-auth

Change-Id: I975931b22756bc51235868b782aff20286d55681
More testing on newly discovered issue: listStatus marks a child
dir as unauth, even if it was auth

Change-Id: I7d8ab25f5e73e9ea4767e8456f7e0afc92dec28c
@steveloughran steveloughran added bug fs/s3 changes related to hadoop-aws; submitter must declare test endpoint work in progress PRs still Work in Progress; reviews not expected but still welcome labels Jan 16, 2020
This fixes the problem wherein a listStatus of a parent dir
would stamp on the isAuthoritative flag of every child entry.

This was caused by us blindly overriding entries with new ones.

This patch

* avoids writing unchanged entries by building a list of those and passing it
  down to the meta store in the put(DirListMeta) call.
* in DDB: filter out those entries.
* in non-auth mode, build up a list of entries to add, and write in a batch
  at the end. This is more efficient than the one-by-one operation which
  was being performed, especially as there is no BulkOperationState to cache
  previous work.
* adds tests to verify no DDB writes take place on repeated lists
* fixes up all calls of put() to handle the new list of unchanged entries
* Fixes a bug in TestS3Guard which caused NPEs

The code in S3Guard.dirListingUnion() is a bit ugly as it
has two very different sequences (auth vs nonauth) intermingled.
We should consider splitting up the two union mechanisms,
so as to make it easier to understand how the different update/unions
work.

Change-Id: If5f2427f49a46cb7827f1634b6528ddd0e265778
@steveloughran
Contributor Author

The latest patch fixes the problem wherein a listStatus of a parent dir
would stamp on the isAuthoritative flag of every child entry.

This was caused by us blindly overriding entries with new ones.

This patch

  • avoids writing unchanged entries by building a list of those and passing it
    down to the meta store in the put(DirListMeta) call (see the sketch below).
  • in DDB: filter out those entries.
  • in non-auth mode, build up a list of entries to add, and write in a batch
    at the end. This is more efficient than the one-by-one operation which
    was being performed, especially as there is no BulkOperationState to cache
    previous work.
  • adds tests to verify no DDB writes take place on repeated lists
  • fixes up all calls of put() to handle the new list of unchanged entries
  • Fixes a bug in TestS3Guard which caused NPEs

The code in S3Guard.dirListingUnion() is a bit ugly as it
has two very different sequences (auth vs nonauth) intermingled.
We should consider splitting up the two union mechanisms,
so as to make it easier to understand how the different update/unions
work.

Note: we should see a reduction in DDB writes here, in both modes.
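
As a rough illustration of the "write only what changed" union above, here is a self-contained sketch with simplified stand-in types; the real logic lives in S3Guard.dirListingUnion() and the metastore put() calls:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Simplified stand-ins, not the actual PathMetadata / MetadataStore types.
final class ListingUnionSketch {

  static final class Entry {
    final String path;
    final long length;
    final String etag;

    Entry(String path, long length, String etag) {
      this.path = path;
      this.length = length;
      this.etag = etag;
    }

    boolean sameAs(Entry other) {
      return other != null
          && length == other.length
          && (etag == null ? other.etag == null : etag.equals(other.etag));
    }
  }

  /**
   * Merge the S3 listing into the metastore view, collecting only the
   * entries which are new or changed so they can be written in one batched put.
   */
  static List<Entry> entriesToWrite(Map<String, Entry> metastoreListing,
      List<Entry> s3Listing) {
    List<Entry> toWrite = new ArrayList<>();
    for (Entry fromS3 : s3Listing) {
      Entry known = metastoreListing.get(fromS3.path);
      if (!fromS3.sameAs(known)) {
        toWrite.add(fromS3);        // new or changed: queue for the batch write
      }
      // unchanged entries are skipped: no metastore write, and the existing
      // record (including any auth flag) is left untouched
    }
    return toWrite;
  }
}
```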


Updated PR is failing two tests, both clearly related:

[INFO] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1 s - in org.apache.hadoop.fs.s3a.ITestBlockingThreadPoolExecutorService
[INFO] Running org.apache.hadoop.fs.s3a.ITestS3GuardWriteBack
[ERROR] Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 6.373 s <<< FAILURE! - in org.apache.hadoop.fs.s3a.ITestS3GuardWriteBack
[ERROR] testListStatusWriteBack(org.apache.hadoop.fs.s3a.ITestS3GuardWriteBack)  Time elapsed: 6.154 s  <<< FAILURE!
java.lang.AssertionError: Metadata store without write back should still only know about /OnS3AndMS, but it has: DirListingMetadata{path=s3a://hwdev-steve-ireland-new/fork-0004/test/ListStatusWriteBack, listMap={s3a://hwdev-steve-ireland-new/fork-0004/test/ListStatusWriteBack/OnS3AndMS=DDBPathMetadata{isAuthoritativeDir=true, PathMetadata=PathMetadata{fileStatus=S3AFileStatus{path=s3a://hwdev-steve-ireland-new/fork-0004/test/ListStatusWriteBack/OnS3AndMS; isDirectory=true; modification_time=0; access_time=0; owner=stevel; group=stevel; permission=rwxrwxrwx; isSymlink=false; hasAcl=false; isEncrypted=true; isErasureCoded=false} isEmptyDirectory=UNKNOWN eTag=null versionId=null; isEmptyDirectory=UNKNOWN; isDeleted=false; lastUpdated=1579283208710}}, s3a://hwdev-steve-ireland-new/fork-0004/test/ListStatusWriteBack/OnS3=DDBPathMetadata{isAuthoritativeDir=false, PathMetadata=PathMetadata{fileStatus=S3AFileStatus{path=s3a://hwdev-steve-ireland-new/fork-0004/test/ListStatusWriteBack/OnS3; isDirectory=true; modification_time=0; access_time=0; owner=stevel; group=stevel; permission=rwxrwxrwx; isSymlink=false; hasAcl=false; isEncrypted=true; isErasureCoded=false} isEmptyDirectory=UNKNOWN eTag=null versionId=null; isEmptyDirectory=UNKNOWN; isDeleted=false; lastUpdated=1579283208918}}}, isAuthoritative=false, lastUpdated=1579283208450} expected:<1> but was:<2>
	at org.junit.Assert.fail(Assert.java:88)
	at org.junit.Assert.failNotEquals(Assert.java:834)
	at org.junit.Assert.assertEquals(Assert.java:645)
	at org.apache.hadoop.fs.s3a.ITestS3GuardWriteBack.testListStatusWriteBack(ITestS3GuardWriteBack.java:89)
[ERROR] Tests run: 3, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 53.855 s <<< FAILURE! - in org.apache.hadoop.fs.s3a.auth.ITestRestrictedReadAccess
[ERROR] testNoReadAccess[nonauth](org.apache.hadoop.fs.s3a.auth.ITestRestrictedReadAccess)  Time elapsed: 19.955 s  <<< FAILURE!
java.lang.AssertionError: Expected a java.nio.file.AccessDeniedException to be thrown, but got the result: : true
	at org.apache.hadoop.test.LambdaTestUtils.intercept(LambdaTestUtils.java:499)
	at org.apache.hadoop.test.LambdaTestUtils.intercept(LambdaTestUtils.java:384)
	at org.apache.hadoop.fs.s3a.auth.ITestRestrictedReadAccess.accessDenied(ITestRestrictedReadAccess.java:681)
	at org.apache.hadoop.fs.s3a.auth.ITestRestrictedReadAccess.checkDeleteOperations(ITestRestrictedReadAccess.java:638)
	at org.apache.hadoop.fs.s3a.auth.ITestRestrictedReadAccess.testNoReadAccess(ITestRestrictedReadAccess.java:307)

@steveloughran
Contributor Author

@bgaborg this is becoming ready to look at, though there are still tests failing.

@steveloughran
Contributor Author

Ok, my patch changes behaviour slightly (I'd actually thought it was a bug, but ITestS3GuardWriteBack thinks it is expected).

With my patch, when you list a dir in nonauth mode, it adds records in DDB for any which don't exist, which helps to build up that full list of files. Currently, in nonauth mode, we only add changed files.

What to do? We add files to S3Guard on creation, import, etc., and in auth mode we do build up that list. So why not in nonauth?

Side issue: when we do that listing of a nonauth dir in auth mode, the auth marking lasts until the bit is cleared (which file deletes do, needlessly). But when reconciling the lists, we don't worry about files listed in S3Guard but not found in the FS. So we mark the dir as authoritative even though there could be errors in the listing. Seems to me we should be looking at the TTL of entries in the original DDB listing and considering missing (expired) files as deleted.
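
A minimal sketch of that TTL idea, purely illustrative (the names and the expiry rule are assumptions, not current S3Guard behaviour):

```java
// Illustrative only: decide whether a metastore-only entry should be
// treated as deleted when reconciling a directory listing.
final class TtlReconcileSketch {

  static boolean shouldTreatAsDeleted(boolean presentInS3,
      long lastUpdatedMillis, long nowMillis, long ttlMillis) {
    if (presentInS3) {
      return false;                 // still visible in the object store
    }
    // only in the metastore: consider the file gone once its record
    // is older than the metadata TTL
    return nowMillis - lastUpdatedMillis > ttlMillis;
  }
}
```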

Oh, S3Guard is the pain of my life.

* Out of date entries only propagate from S3 to DDB *when there's a DDB entry*.
  This is the current policy; there's now a constant to change behaviour and more
  discussion about the details. So the next person maintaining it will
  understand what is going on better.
* Only writes the auth bit to an empty dir when the path is auth, rather
  than doing it whenever an empty dir is recorded in S3Guard.
  ITestRestrictedReadAccess showed that problem; there's been a bit of
  tuning in that test to make it more robust to configs and previous test runs.

Tune ITestDynamoDBMetadataStoreAuthoritativeMode
* remove superfluous test case
* more asserts about directory states during rename
* enable another ignored test case

Change-Id: I309376f38e9983c901a53a26f89cbf372c7a01da
@steveloughran
Contributor Author

Latest PR is up; tested S3 Ireland with and without DDB, at scale.

  • Out of date entries only propagate from S3 to DDB when there's a DDB entry.
    This is the current policy; there's now a constant to change the behaviour, and more
    discussion about the details, so the next person maintaining it will
    understand what is going on better. (Note: there is also a change in ITestS3GuardWriteBack to disable the test when write-back always happens. See the sketch at the end of this comment.)

  • Only writes the auth bit to an empty dir when the path is auth, rather
    than doing it whenever an empty dir is recorded in S3Guard.
    ITestRestrictedReadAccess showed that problem; there's been a bit of
    tuning in that test to make it more robust to configs and previous test runs.

Tune ITestDynamoDBMetadataStoreAuthoritativeMode

  • remove superfluous test case
  • more asserts about directory states during rename
  • enable another ignored test case
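
A hedged sketch of the propagation policy in the first bullet above; the constant and method names are illustrative, not the identifiers actually added in the patch:

```java
// Illustrative names only.
final class WriteBackPolicySketch {

  /**
   * Current policy: an out-of-date entry found in S3 is only written back
   * to the metastore if a record for that path already exists there.
   * Flipping this constant would make listings always write back.
   */
  static final boolean ALWAYS_WRITE_BACK_OUT_OF_DATE_ENTRIES = false;

  static boolean shouldWriteBack(boolean entryExistsInMetastore) {
    return ALWAYS_WRITE_BACK_OUT_OF_DATE_ENTRIES || entryExistsInMetastore;
  }
}
```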

@steveloughran steveloughran removed the work in progress PRs still Work in Progress; reviews not expected but still welcome label Jan 20, 2020
test tweaks
* remove unused imports
* in MetaStoreTestBase, name the empty list EMPTY_LIST to make clear
  that is what it is

Change-Id: If53bf1c4810d397fe1b8d3d7b4c7c9ceb1344173
@bgaborg bgaborg left a comment

Thanks for finding out why we lost auth on dirListingMetadata.
Looks good to me overall; I added some comments and am running the tests.

// in non-auth listings, we compare the file status of the metastore
// list with those in the FS, and overwrite the MS entry if
// either of two conditions are met
// - there is no entry in the metadata and

nit: there is no entry in the metastore and

@@ -55,7 +53,7 @@
import static org.apache.hadoop.fs.s3a.S3ATestUtils.removeBaseAndBucketOverrides;
import static org.apache.hadoop.fs.s3a.S3AUtils.applyLocatedFiles;
import static org.apache.hadoop.fs.s3a.Statistic.OBJECT_LIST_REQUESTS;
import static org.apache.hadoop.fs.s3a.Statistic.S3GUARD_METADATASTORE_AUTHORITATIVE_DIRECTORIES_UPDATED;
import static org.apache.hadoop.fs.s3a.Statistic.S3GUARD_METADATASTORE_AUTHORITATIVE_DIRECTORIES_UPDATED;import static org.apache.hadoop.fs.s3a.Statistic.S3GUARD_METADATASTORE_RECORD_WRITES;

nit: import static on a new line

Contributor Author

Sorry, accident. Never even realised you could do that, though I guess it makes sense from a parser perspective.

// collection
PathMetadata pathMetadata = originalMD;

if (!isAuthoritative) {

Could we factor this out into another method? Just to make Uncle Bob happy :).
Jokes aside, this grew really huge, and we should split the auth and non-auth execution paths to make them more maintainable in the future. It was your idea in one of your comments on the PR, and it was a good one.

Contributor Author

done

@steveloughran
Contributor Author

Thanks, I'll fix the nits and split the two modes out. As noted, I was starting to think that way.

Moving from two intermingled dirListingUnion algorithms in the same method,
the auth and nonauth merge/update operations have been split into their
own methods. There is a bit of duplication, but at least now the different
operations are isolated enough that it's possible to understand them.

TestS3Guard has been extended to help test some of this.
We can't verify that existing entries don't get overwritten,
but the unit tests are at least checking both algorithms.

Change-Id: I24f3787e31dc3739d0f71b800ed3732b2fd15b94
@steveloughran
Contributor Author

steveloughran commented Jan 23, 2020

Moving from two intermingled dirListingUnion algorithms in the same method, the auth and nonauth merge/update operations have been split into their own methods. There is a bit of duplication, but at least now the different operations are isolated enough that it's possible to understand them.

TestS3Guard has been extended to help test some of this. We can't verify that existing entries don't get overwritten, but the unit tests are at least checking both algorithms.
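
Roughly, the split has this shape (a sketch only; the method and type names here are placeholders rather than the ones actually introduced):

```java
// Placeholder names; the real entry point is S3Guard.dirListingUnion().
final class DirListingUnionSketch {

  interface Listing {
    // stand-in for DirListingMetadata
  }

  void dirListingUnion(Listing metastoreListing, boolean authoritative) {
    // dispatch once, so each merge/update sequence can be read on its own
    if (authoritative) {
      authoritativeUnion(metastoreListing);
    } else {
      nonAuthoritativeUnion(metastoreListing);
    }
  }

  private void authoritativeUnion(Listing metastoreListing) {
    // merge the S3 listing into the metastore view, then mark the
    // directory entry itself as authoritative in a final update
  }

  private void nonAuthoritativeUnion(Listing metastoreListing) {
    // collect new/changed entries and push them in one batched put,
    // leaving existing (possibly authoritative) records untouched
  }
}
```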

Tested: S3 Ireland. One odd test failure which has me worried, but I've seen it happen on another branch without any of this change, or its predecessor in

java.lang.AssertionError: Expected a java.io.FileNotFoundException to be thrown, but got the result: : 16

	at org.apache.hadoop.test.LambdaTestUtils.intercept(LambdaTestUtils.java:499)
	at org.apache.hadoop.test.LambdaTestUtils.intercept(LambdaTestUtils.java:384)
	at org.apache.hadoop.fs.s3a.ITestS3GuardOutOfBandOperations.expectExceptionWhenReading(ITestS3GuardOutOfBandOperations.java:975)
	at org.apache.hadoop.fs.s3a.ITestS3GuardOutOfBandOperations.deleteFileInListing(ITestS3GuardOutOfBandOperations.java:959)
	at org.apache.hadoop.fs.s3a.ITestS3GuardOutOfBandOperations.testListingDelete(ITestS3GuardOutOfBandOperations.java:309)

What is happening is we've deleted a file, "spun" for it to go away from the unguarded FS, then when opening it in the guarded FS we see the data come back.

Need to look more; it could be because my store is versioned, but if so, why fail so fast?
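
For reference, the failing check has roughly this shape (a sketch, not the actual ITestS3GuardOutOfBandOperations code; the FileSystem handles and the path are placeholders):

```java
import java.io.FileNotFoundException;

import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.test.LambdaTestUtils;

class ListingDeleteSketch {

  void deletedFileMustNotReopen(FileSystem rawFS, FileSystem guardedFS,
      Path file) throws Exception {
    rawFS.delete(file, false);
    // once the raw (unguarded) store agrees the object is gone,
    // opening through the guarded FS should fail too...
    LambdaTestUtils.intercept(FileNotFoundException.class,
        () -> guardedFS.open(file));
    // ...but on a versioned bucket the read can still return an old
    // version, which is the surprise described above.
  }
}
```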

@hadoop-yetus

🎊 +1 overall

Vote Subsystem Runtime Comment
+0 🆗 reexec 1m 12s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 1s No case conflicting files found.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 9 new or modified test files.
_ trunk Compile Tests _
+1 💚 mvninstall 20m 57s trunk passed
+1 💚 compile 0m 30s trunk passed
+1 💚 checkstyle 0m 23s trunk passed
+1 💚 mvnsite 0m 35s trunk passed
+1 💚 shadedclient 15m 6s branch has no errors when building and testing our client artifacts.
+1 💚 javadoc 0m 25s trunk passed
+0 🆗 spotbugs 0m 57s Used deprecated FindBugs config; considering switching to SpotBugs.
+1 💚 findbugs 0m 56s trunk passed
_ Patch Compile Tests _
+1 💚 mvninstall 0m 32s the patch passed
+1 💚 compile 0m 27s the patch passed
+1 💚 javac 0m 27s the patch passed
-0 ⚠️ checkstyle 0m 19s hadoop-tools/hadoop-aws: The patch generated 28 new + 52 unchanged - 0 fixed = 80 total (was 52)
+1 💚 mvnsite 0m 32s the patch passed
+1 💚 whitespace 0m 0s The patch has no whitespace issues.
+1 💚 shadedclient 15m 12s patch has no errors when building and testing our client artifacts.
+1 💚 javadoc 0m 21s the patch passed
+1 💚 findbugs 1m 16s the patch passed
_ Other Tests _
+1 💚 unit 1m 36s hadoop-aws in the patch passed.
+1 💚 asflicense 0m 30s The patch does not generate ASF License warnings.
62m 14s
Subsystem Report/Notes
Docker Client=19.03.5 Server=19.03.5 base: https://builds.apache.org/job/hadoop-multibranch/job/PR-1810/5/artifact/out/Dockerfile
GITHUB PR #1810
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle
uname Linux 8ac786981b07 4.15.0-74-generic #84-Ubuntu SMP Thu Dec 19 08:06:28 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality personality/hadoop.sh
git revision trunk / 6c1fa24
Default Java 1.8.0_232
checkstyle https://builds.apache.org/job/hadoop-multibranch/job/PR-1810/5/artifact/out/diff-checkstyle-hadoop-tools_hadoop-aws.txt
Test Results https://builds.apache.org/job/hadoop-multibranch/job/PR-1810/5/testReport/
Max. process+thread count 348 (vs. ulimit of 5500)
modules C: hadoop-tools/hadoop-aws U: hadoop-tools/hadoop-aws
Console output https://builds.apache.org/job/hadoop-multibranch/job/PR-1810/5/console
versions git=2.7.4 maven=3.3.9 findbugs=3.1.0-RC1
Powered by Apache Yetus 0.11.1 https://yetus.apache.org

This message was automatically generated.

@bgaborg bgaborg left a comment

@steveloughran thanks for fixing this.
+1, tests ran without any errors.

@steveloughran
Contributor Author

thanks!

I've accrued a whole set of intermittent test failures, and one failure if the bucket is versioned; I should do a fix-all patch.

@steveloughran steveloughran deleted the s3/HADOOP-16746-mkdirs-auth branch October 15, 2021 19:41