-
Notifications
You must be signed in to change notification settings - Fork 4.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix filestream flaky truncated file test #35759
Conversation
f5a5142
to
1d0f26c
Compare
Pinging @elastic/elastic-agent-data-plane (Team:Elastic-Agent-Data-Plane) |
This pull request does not have a backport label.
To fixup this pull request, you need to add the backport labels for the needed
|
This commit fixes the formatting issue and only logs the error if it is not `context.Canceled`
This commit fixes and improves some log messages.
This commit fixes a flaky test. The flakiness was due to the modification date of a file not changing between two different operations.
1d0f26c
to
e16c542
Compare
This commit brings the improvements from elastic#35754 to this PR.
08b431e
to
e13e183
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Based on #35571 (comment) Would it not be better to remove the modification time check in the block below, and instead look at the size first? beats/filebeat/input/filestream/fswatch.go Lines 157 to 171 in 4af3e60
Can't this problem happen outside of our tests? It looks like we are only fixing the test, and not what seems to be an actual limitation in Filebeat: // Sleep for a second before modifying the file again
// this ensures the modification time of the file will
// be different between the last write and the truncation.
time.Sleep(time.Second) |
I've been thinking a lot about it since yesterday and I believe the filestream input works fine (maybe there is a very edge case not fully covered). On a "normal" operation that is what I expect to happen:
The problem with the test is that we write data then truncate the file within the same second (that's the failure scenario), then we never touch it again. Which effectively puts the file on this "limbo". However there is no new data to be harvested. The only edge case I see is if outside of a test scenario 3 and 4 also happen within the same second and no other data is appended to the file, in that case we'd lose this bit of data. This is a very specific, and likely very rare, edge case. Our documentation already warns about truncating files and the problems it can bring: https://www.elastic.co/guide/en/beats/filebeat/current/file-log-rotation.html I haven't looked the internals of our copy truncate prospector (https://www.elastic.co/guide/en/beats/filebeat/current/filebeat-input-filestream.html#_rotation_external_strategy_copytruncate), but it's supposed to help on those situations, however it's not GA yet.
That is indeed an option. I would change beats/filebeat/input/filestream/fswatch.go Line 157 in 4af3e60
so we don't affect the "resend on mod time" feature. |
fswatcher now uses the modification time and changes in the file size to trigger either a truncate or write event.
After thinking more about it and talking with @cmacknz , I decided to implement his suggestion from #35759 (comment). @faec, @leehinman could you have a final look at it? |
bfb6387 introduced changes to the fswatcher
5392af1
to
7d7659b
Compare
Accidentally pushed to this branch because I had it checked out 🤦♂️ , force pushed the commits away. |
❕ Build Aborted
Expand to view the summary
Build stats
Test stats 🧪
Steps errorsExpand to view the steps failures
|
fswatcher now uses the modification time and changes in the file size to trigger either a truncate or write event. Other small changes: - Fixed some log formatting issue and only log the error if it is not `context.Canceled` - Fixes and improves some log messages. - Brings the improvements from #35754 to this PR. (cherry picked from commit bdb67bc) # Conflicts: # filebeat/input/filestream/filestream.go # filebeat/input/filestream/internal/input-logfile/clean.go
fswatcher now uses the modification time and changes in the file size to trigger either a truncate or write event. Other small changes: - Fixed some log formatting issue and only log the error if it is not `context.Canceled` - Fixes and improves some log messages. - Brings the improvements from #35754 to this PR. (cherry picked from commit bdb67bc) # Conflicts: # filebeat/input/filestream/filestream.go # filebeat/input/filestream/internal/input-logfile/clean.go --------- Co-authored-by: Tiago Queiroz <[email protected]>
What does this PR do?
This PR fixes a flaky test. The flakiness was due to the
modification date of a file not changing between two different
operations.
It also improves some of
inputTestingEnvironment
and fixes some other log messages.A more detailed explanation is on the related issue.
Why is it important?
Checklist
- [ ] I have made corresponding changes to the documentation- [ ] I have made corresponding change to the default configuration files- [ ] I have added tests that prove my fix is effective or that my feature works- [ ] I have added an entry inCHANGELOG.next.asciidoc
orCHANGELOG-developer.next.asciidoc
.## Author's Checklist## How to test this PR locally## Related issues## Use cases## Screenshots## Logs