The problem with fingerprinters #2701
Comments
|
Do you mean there's no issue here?
Why is that btw? Is it due to issues with the
Reducing the byte size of the We have a solution that requires manual configuration and serious insight. Users have to estimate the meaningful amount of bytes that contribute to the checksum based on their use case. Incorrect estimates will result in lost log events. |
Oh, the good ol' fingerprinting conversation 😄. @MOZGIII we settled on the
Therefore, If you have a silver bullet, I'm all for it. Otherwise, we might consider requiring users to choose a strategy instead of defaulting to one. If you have improvements you'd like to make, please offer them in a succinct, constructive manner. If we agree there are improvements to be had, we'll need an RFC, simply because this same discussion keeps popping up and an RFC would prevent that going forward. |
Checksums are more direct, more robust, less dependent on platform-specific implementation details, etc, etc.
Absolutely not. That would require making two massive assumptions:
I very strongly disagree with this. The vast, vast majority of use cases for log shipping involve files that quickly grow past 256 bytes. It would be great if we had a solution that handled everything that checksumming handled while also taking care of logs that consist solely of |
Checksums don't work for all files - that's a major downside, in my view - much more problematic than everything that you've listed combined.
I understand where you're coming from, and I even mostly agree. But this is just choosing to ignore the problem - the cases where the checksum fingerprinter doesn't work. I've stumbled upon this in practice often enough to get annoyed to the point that I created this issue. I just want to fix it 😄 I agree there's no solution in sight in the framework that's been built so far.
Interesting. Wouldn't I also have another idea. What if we treat files that aren't long enough as if they have a zero checksum? Then we'd need a mechanism to transition such files from the |
I somehow missed #2701 (comment).
😄 I know this has been discussed a lot before, but apparently things are still not there yet.
Ok, some concrete points! Thx!
No silver bullets (so far). Just some ideas (a couple) I'd like to discuss, pre RFC. |
If you have a solution that works for all situations I would be very eager to hear about it. All of the implementations I'm aware of have tradeoffs and we've chosen checksums because the downside (requiring that files be a certain configurable size) impacts the least important use case.
To be clear, no one expects that and that is not at all what I meant by platform-independent. There are meaningful differences in behavior across platforms and filesystems. This causes problems with tools that use device/inode fingerprinting that must be documented and worked around.
We do not ignore the problem. We provide multiple ways to adjust the behavior if your use case is not well supported by the defaults.
Please be more specific than "files of mixed nature". Obviously we want the tool to work well for as many circumstances as possible. Are you stating that k8s deployments routinely produce log files that are less than 256 bytes in size?
No, because of issues like inode reuse.
Then we would consider all such files to be the same and you couldn't have more than one of them. A zero checksum would not uniquely identify any of them.
You should not take the mere presence of inodes as evidence that there are no meaningful behavior differences here.
That is not a problem that is solved by adding abstraction.
There is a very meaningful difference between being broken and making necessary tradeoffs. |
Ok, I think I figured out a way to introduce a significant improvement with minimal effort: #2890. |
Closing this in favor of #2926. |
Currently, we have two file fingerprinters:
- Fingerprinter::DevInode - uses the file's inode number to compute the checksum
- Fingerprinter::Checksum - uses the first few bytes of the file to compute the checksum

You'd think the Fingerprinter::DevInode is the preferred one, but its usage can cause problems: upon log file rotation, if the file is truncated instead of moved and recreated, and then written from scratch, the inode fingerprinter will fail to properly distinguish between the two. This will lead to incorrect behavior at the checkpoints, and the system will stop reading the logs.

Fingerprinter::Checksum mostly works, but it has a nasty problem - when files are too small, the fingerprinter will skip them, and we won't ever get data from them. This is usually not a big deal when collecting logs from long-running services - they usually log plenty, but it doesn't work for one-time jobs that write just a tiny summary of the operation result to the log file.

Currently, there's no solution we can offer that would just work.
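To make the two failure modes concrete, here is a minimal, Unix-only sketch of both strategies. It is not the project's actual implementation: the SketchFingerprinter type, its bytes field, and the use of the standard library's DefaultHasher as a stand-in checksum are all assumptions made for illustration.

```rust
use std::collections::hash_map::DefaultHasher;
use std::fs::{self, File};
use std::hash::{Hash, Hasher};
use std::io::{self, Read};
use std::os::unix::fs::MetadataExt; // Unix-only: exposes dev() and ino()
use std::path::Path;

/// Hypothetical stand-in for the two strategies discussed in this issue.
#[derive(Debug, Clone, Copy)]
pub enum SketchFingerprinter {
    /// Identify a file by its (device, inode) pair.
    DevInode,
    /// Identify a file by a hash of its first `bytes` bytes.
    Checksum { bytes: usize },
}

impl SketchFingerprinter {
    /// Returns Ok(None) when the file cannot be fingerprinted yet,
    /// i.e. when it is shorter than the configured checksum window.
    pub fn fingerprint(&self, path: &Path) -> io::Result<Option<u64>> {
        // DefaultHasher is only a placeholder for a real checksum.
        let mut hasher = DefaultHasher::new();
        match *self {
            SketchFingerprinter::DevInode => {
                let meta = fs::metadata(path)?;
                // Truncating and rewriting a file in place keeps this pair
                // unchanged, so the "new" log looks like the old one.
                (meta.dev(), meta.ino()).hash(&mut hasher);
            }
            SketchFingerprinter::Checksum { bytes } => {
                let mut buf = Vec::with_capacity(bytes);
                File::open(path)?.take(bytes as u64).read_to_end(&mut buf)?;
                if buf.len() < bytes {
                    // Too small: the file is skipped, which is how short
                    // one-off job logs end up never being read.
                    return Ok(None);
                }
                buf.hash(&mut hasher);
            }
        }
        Ok(Some(hasher.finish()))
    }
}
```

With this sketch, a file holding only a one-line job summary yields None under Checksum { bytes: 256 }, while a file that is truncated and rewritten in place keeps the same DevInode fingerprint even though its contents are entirely new.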
Let's discuss our options to solve this problem.
Potentially relevant issues: #2163.