.raw files are not closed automatically when using rotation-period #82

Open
MiniPierre opened this issue Apr 29, 2022 · 14 comments

@MiniPierre
Contributor

Hi,
I have a problem with compactor not performing log rotation when it is configured with the dnstap-socket option. The .raw file is not closed until new data is received, so the data collected so far stays stuck in the open file. Could you have a look?

@MiniPierre changed the title from ".raw files are not closed when using rotation-period with dnstap socket" to ".raw files are not closed automatically when using rotation-period with dnstap socket" on Apr 29, 2022
@MiniPierre
Contributor Author

A colleague of mine just informed me that the same problem occurs when compactor is listening on network interfaces, so it seems to be a general malfunction rather than something specific to dnstap.

@MiniPierre changed the title from ".raw files are not closed automatically when using rotation-period with dnstap socket" to ".raw files are not closed automatically when using rotation-period" on Apr 29, 2022
@saradickinson
Contributor

Hi,

Can you provide a little more detail after reading the description below? Also, if you are using compactor 1.2.0 and are able to reproduce the problem, could you enable log-file-handling?

One thing to point out here is that it is a 'feature' of the compactor design that file rotation is not triggered at the moment the rotation-period expires; instead it occurs when compactor receives the first packet after the rotation-period has expired. This means that under low traffic conditions the file will only close and roll over when that next packet arrives. For example, if you have a 5 minute file rotation but only 1 packet arriving every 4 minutes, then:

  • compactor is started at time T, creates file T.cdns.raw and sees packet N
  • at time T+4 (4 minutes later) packet N+1 arrives
  • at time T+5 the rotation period expires, but no rollover occurs
  • at time T+8 packet N+2 arrives and file rollover is triggered, closing T.cdns.raw and opening T+8.cdns.raw

The result is both a delayed file rotation and, in this case, a file with a collection period of 8 minutes. Note that compactor does store both a start and an end time for a file, allowing the actual collection period to be calculated, and these numbers can be seen in the .info file generated with inspector.
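
To make the timeline above concrete, here is a small Python sketch of "rotation is only checked when a packet arrives". It is a simplified model written for this thread, not compactor's actual code; the `CaptureFile` class and `simulate` function are hypothetical names, and times are in minutes relative to T.

```python
from dataclasses import dataclass


@dataclass
class CaptureFile:
    start: float      # time the file was opened (minutes after T)
    end: float        # time the file was closed (minutes after T)
    packets: int = 0  # packets written to this file


def simulate(start_time, packet_times, rotation_period):
    """Model a rotation check that only runs when a packet arrives.

    Returns (closed_files, still_open_file).
    """
    closed = []
    current = CaptureFile(start=start_time, end=start_time)
    for t in packet_times:
        if t - current.start >= rotation_period:
            # The period expired earlier, but it is only noticed now.
            current.end = t
            closed.append(current)
            current = CaptureFile(start=t, end=t)
        current.packets += 1
    return closed, current


# 5 minute rotation, packets N, N+1, N+2 at T, T+4 and T+8.
closed, still_open = simulate(0, [0, 4, 8], 5)
for f in closed:
    print(f"file opened at T+{f.start}, closed at T+{f.end}, "
          f"duration {f.end - f.start} min, {f.packets} packets")
```

In this model the first file is only closed when packet N+2 arrives at T+8, so it covers 8 minutes instead of 5, and if no further packet arrives the second file simply stays open.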

I thought we had this behaviour described in detail in the documentation but looking through the docs I don't actually see it... We've had discussions with other users over this behaviour but once they understood it they updated their analysis tools to use the actual file duration to calculate packet rates, rather than assume it was always fixed.
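
As a rough illustration of that adjustment, the sketch below computes a packet rate from a file's actual start and end times rather than from the configured rotation period. The function name and the numbers are made up for the example, and how the start and end times are read out of the .info file is left out.

```python
def packet_rate(packets, start, end):
    """Packets per second over the file's actual collection period.

    `start` and `end` are timestamps in seconds.
    """
    duration = end - start
    if duration <= 0:
        raise ValueError("end must be later than start")
    return packets / duration


# Hypothetical file: configured for 300 s rotation, but actually open for 480 s.
packets = 960
actual = packet_rate(packets, start=0, end=480)
assumed = packets / 300  # what a tool would report if it assumed a fixed period

print(f"rate from actual duration:  {actual} packets/s")
print(f"rate from nominal duration: {assumed} packets/s")
```

With these example numbers the fixed-period assumption overstates the rate (3.2 instead of 2.0 packets per second), which is the kind of error the actual file duration avoids.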

I wonder if this is what you are seeing or if you have found a different issue? Or perhaps this presents a different operational problem for you?

@MiniPierre
Contributor Author

Hi,
The behaviour you are describing is exactly the issue we are having. Our problem is that we collect C-DNS files and aggregate them every minute, so collecting logs at the precise minute in which they were created is an essential part of our architecture.
Since this seems to be the expected behaviour of compactor, asking for the default to change could be problematic. Would it be possible to add an option that forces the file rollover once the period has expired, even if no packet arrives?

@saradickinson
Contributor

It is not a simple extension of current code but I can look at the options in more detail and get back to you next week.

The other 'workaround' to consider is sending a 'dummy' DNS packet to trigger the C-DNS file rollover. It pollutes your traffic but will do the job.
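
For anyone who wants to try that workaround, a minimal Python sketch of a 'dummy' query is shown below. It builds a plain A-record query by hand using only the standard library; the query name, the query ID and the `send_dummy_query` helper are illustrative assumptions, and the server address must be one whose traffic compactor actually captures.

```python
import socket
import struct


def build_dns_query(name, query_id=0x1234):
    """Build a minimal DNS query for an A record in wire format."""
    # Header: ID, flags (only RD set), QDCOUNT=1, ANCOUNT=NSCOUNT=ARCOUNT=0
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: each label prefixed by its length, terminated by a zero byte
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    question = qname + struct.pack("!HH", 1, 1)  # QTYPE=A, QCLASS=IN
    return header + question


def send_dummy_query(server, name="example.com", port=53):
    """Send one UDP query and ignore any reply (hypothetical helper)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(build_dns_query(name), (server, port))


# Example (not run here): send_dummy_query("192.0.2.53")
```

Run from cron or a timer shortly after each rotation boundary, this would give compactor a packet to react to, at the cost of one artificial query per period in the captured data.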

@MiniPierre
Contributor Author

We have considered such a workaround, but we would rather not rely on an external host making a DNS query to ensure log rotation (and we can't perform the query locally, for reasons specific to our architecture).
I saw that compactor has a signal handler. As a quick fix, would it be possible to intercept a specific signal that forces log rotation?

@saradickinson
Contributor

That's what I'm looking at - the slightly trickier part is cleanly passing the event to the threads doing the writing, but I think it is doable. I also need to check that the downstream tools can handle an empty file, which this change would make possible. I'll work on this more next week (didn't get enough time this week).

As you'll see in the manual, a SIGHUP will cause the compactor to re-read the config file, which also causes file rotation. However, this does lead to a small time interval when traffic isn't captured, as the internal packet sniffer resets too. For a node not getting any traffic (and hence not rotating) that might not be a problem, but it isn't a robust or reliable solution. I mention it just in case you want to experiment.

@saradickinson
Contributor

Update: OK - I think I have a way to do this and will try to get a 1.2.2-beta1 release out early next week.

@saradickinson
Contributor

There is a 1.2.2-beta1 release available now with an additional mechanism that forces a C-DNS file rotation when a SIGUSR1 is received. It is recommended to use either the configuration parameters OR a SIGUSR1 to rotate files, as trying to use both with the same rotation period will lead to a race condition that could produce extra files.

Please test this and see if it meets your use case. Now I have the mechanism figured out I can look to improve the internal triggering to run off an internal timer in future.
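
Until there is an internal timer, an external one can drive the SIGUSR1 mechanism. The Python sketch below is one possible way to do it, assuming a Unix host and that the compactor process ID is already known (how it is obtained, for example from a pid file, is left out); `seconds_until_next_boundary` and `rotate_forever` are hypothetical helper names.

```python
import os
import signal
import time


def seconds_until_next_boundary(now, period):
    """Seconds from `now` to the next multiple of `period` (both in seconds).

    If `now` is exactly on a boundary, this waits one full period.
    """
    return period - (now % period)


def rotate_forever(pid, period=60):
    """Send SIGUSR1 to `pid` at every `period`-second wall-clock boundary."""
    while True:
        time.sleep(seconds_until_next_boundary(time.time(), period))
        os.kill(pid, signal.SIGUSR1)  # asks compactor 1.2.2+ to rotate


# Example (not run here): rotate_forever(compactor_pid, period=60)
```

If this kind of external trigger is used, the note above about not also rotating from the configuration parameters with the same period applies, since both together could produce extra files.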

@MiniPierre
Contributor Author

Thank you so much for your quick reply, we'll have a look tomorrow and get back to you as soon as possible.

@MiniPierre
Contributor Author

Hi, the functionality seems to work correctly, thank you very much! However, as it was only implemented for network interface sniffing, I made a PR to enable it for the dnstap socket as well, which was our main use case.

@saradickinson
Contributor

Thanks for the feedback! My apologies - I misread that the issue was with DNSTAP and had that additional change (and the PCAP triggers) queued up for the next release. I've merged your PR and will try to get another package generated on Monday.

@saradickinson
Contributor

1.2.2-rc1 is now available with the DNSTAP implementation.

@saradickinson
Contributor

@MiniPierre I plan a 1.2.2 production release next week - if you have any feedback on rc1 please let me know!

@MiniPierre
Contributor Author

Sorry for the late feedback, we deployed the rc1 version and everything is working as expected!
