-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ccl/changefeedccl: respect filesystem walk errors in cloudFeed #43006
ccl/changefeedccl: respect filesystem walk errors in cloudFeed #43006
Conversation
This happens when the file disappears between getting the filenames in the directory and lstat-ing the info to pass it to walkFn. In our tests, this happens because export storage first writes the file under a temporary filename and then renames it to the final path. Previously, the temporary file was written in the system temp, but as of a recent change, it's written in the final dir. This was causing issues, so this morning knz changed the temp files to have a ".tmp" suffix which this walkFn already ignores. If you rebased this commit back to just before #42987, then I bet it would fail under stress. |
Ah, I missed that fly by. This still seems like a nice complementary solution. TFTR. bors r+ |
Build failed (retrying...) |
i think this broke something (/ hid something that is broken):
|
bors r- |
Canceled |
Closes cockroachdb#42979. This commit properly handles errors in `cloudFeed`'s `filepath.WalkFunc` implementation. The documentation says: > If there was a problem walking to the file or directory named by > path, the incoming error will describe the problem and the function > can decide how to handle that error (and Walk will not descend into > that directory). If an error is returned, processing stops. I stressed the test for 10,000 iterations and never saw anything, so it's possible that this was a fluke. It's not clear what filesystem error this was throwing so we might still see this pop up again, but at least we'll now correctly propagate the error and surface it instead of hitting an NPE. Because of that, I'm closing the issue for now. Release note: None
6b68391
to
0104746
Compare
Yeah, I think this hit the error case Dan mentioned and broke
The error handling code should have fired only after checking for and ignoring .tmp files. Sorry for the disruption. Done. bors r+ |
42952: pgdate: fix parsing of dates before unix epoch r=otan a=otan Resolves #42937. Previously, pgdate handling dates before the unix epoch with time attached to it rounded up instead of rounding down, meaning times such as '1969-12-30 01:00:00' rounded up to days = `-1`, when it should be days = `-2`. This PR addresses that change. I will probably look at backporting this. Release note (bug fix): We did not previously handled date casts from timestamp/timestamptz with time attached to it for times before the unix epoch correctly. For example, '1969-12-30 01:00:00'::timestamp would round to '1969-12-31' instead of '1969-12-30'. This PR addresses that change. 43006: ccl/changefeedccl: respect filesystem walk errors in cloudFeed r=nvanbenschoten a=nvanbenschoten Closes #42979. This commit properly handles errors in `cloudFeed`'s `filepath.WalkFunc` implementation. The documentation says: > If there was a problem walking to the file or directory named by > path, the incoming error will describe the problem and the function > can decide how to handle that error (and Walk will not descend into > that directory). If an error is returned, processing stops. I stressed the test for 10,000 iterations and never saw anything, so it's possible that this was a fluke. It's not clear what filesystem error this was throwing so we might still see this pop up again, but at least we'll now correctly propagate the error and surface it instead of hitting an NPE. Because of that, I'm closing the issue for now. Release note: None 43009: row: move cfetcher initialization validation steps outside of loop r=yuzefovich a=rohany Some initialization checks that the cfetcher performed were occuring in a loop for no reason. This PR moves them out of the loop. Release note: None 43010: cmd/roachtest: only log sqlsmith on non-errors r=mjibson a=mjibson Release note: None Co-authored-by: Oliver Tan <[email protected]> Co-authored-by: Nathan VanBenschoten <[email protected]> Co-authored-by: Rohan Yadav <[email protected]> Co-authored-by: Matt Jibson <[email protected]>
Build succeeded |
Closes #42979.
This commit properly handles errors in
cloudFeed
'sfilepath.WalkFunc
implementation. The documentation says:
I stressed the test for 10,000 iterations and never saw anything, so it's
possible that this was a fluke. It's not clear what filesystem error this was
throwing so we might still see this pop up again, but at least we'll now
correctly propagate the error and surface it instead of hitting an NPE.
Because of that, I'm closing the issue for now.
Release note: None