Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support for size-based output file rotation #42

Open
kempniu opened this issue Mar 29, 2018 · 7 comments
Open

Support for size-based output file rotation #42

kempniu opened this issue Mar 29, 2018 · 7 comments

Comments

@kempniu
Copy link

kempniu commented Mar 29, 2018

BIND allows dnstap output files to be rolled when their size exceeds a configured limit. In order to make that possible, though, a set of inelegant workarounds are required in fstrm-related code. The only way to support such a feature in a clean manner seems to be to get it implemented in fstrm itself. Such an addition could potentially also be welcomed by other fstrm users.

This issue is not really a feature request, but rather a question: would support for size-based output file rolling be considered a welcome addition to fstrm's source code? I am aware that operational usefulness of size-based rolling can be questioned, but apparently there are people interested in using it. What is your opinion?

@edmonds
Copy link
Contributor

edmonds commented Mar 29, 2018

Hey, @kempniu, I'm not much involved in fstrm development these days, but I wrote most of the original code. Can you point me to those "inelegant workarounds" for dnstap file rolling that you refer to in BIND? Just curious how libfstrm is currently being used in BIND.

@kempniu
Copy link
Author

kempniu commented Mar 29, 2018

The following two functions come to mind specifically:

Whenever it is determined that the dnstap output file is to be rolled, named basically stops doing everything else, rolls the output file itself, creates a new fstrm writer thread and increments a static variable which is used by each producer thread (i.e. BIND worker threads) to determine whether the writer thread has been closed since the last time this specific producer thread queued an item to a queue; if so, a new queue is retrieved from the writer thread and subsequently used by the producer thread.

Meanwhile, IIUC, if fstrm supported size-based file rolling "natively", BIND worker threads could just shove data into a queue and let fstrm handle output file rolling. A lot of the complexity could then be removed from BIND, while (probably) not nearly as much code would need to be added to fstrm. But given the doubts surrounding size-based rolling itself, the question is whether more work should be put into supporting it properly (dropping support for size-based rolling altogether is also being considered). Hence me soliciting your opinion :)

@vixie
Copy link
Collaborator

vixie commented Mar 29, 2018

i think fstrm should support the ways people want to use it.

@edmonds
Copy link
Contributor

edmonds commented Mar 29, 2018

@kempniu Yikes, BIND stops doing everything, like you have a giant lock on the whole server? It doesn't respond to queries while the files are being rolled?

Since the fstrm iothr interface only deals with a single writer and there's only a single dedicated I/O thread and we already keep track of output bytes and the current time it would probably be pretty easy to implement timed and sized output rolling.

What if you had something like this in the libfstrm API?

typedef fstrm_res (*fstrm_iothr_writer_replace_callback)(struct fstrm_writer *old_writer,
							 struct fstrm_writer **new_writer,
							 void **user);

fstrm_res
fstrm_iothr_options_set_timed_writer_replace_callback(struct fstrm_iothr_options *opt,
						      unsigned replace_interval,
						      fstrm_iothr_writer_replace_callback,
						      void *user);

fstrm_res
fstrm_iothr_options_set_sized_writer_replace_callback(struct fstrm_iothr_options *opt,
						      uint64_t replace_nbytes,
						      fstrm_iothr_writer_replace_callback,
						      void *user);

With the semantics being that you could provide two callbacks (each callback being a fstrm_iothr_writer_replace_callback function pointer plus user pointer) to be called every replace_interval seconds (timed rolls), or every replace_nbytes of output data written (sized rolls), and the fstrm I/O thread would stop itself and call your provided callback, and let you close the old writer and provide a new writer?

@vixie
Copy link
Collaborator

vixie commented Mar 29, 2018

What if you had something like this in the libfstrm API?

+1.

@kempniu
Copy link
Author

kempniu commented Mar 30, 2018

Yikes, BIND stops doing everything, like you have a giant lock on the whole server? It doesn't respond to queries while the files are being rolled?

Correct, that is what employing isc_task_beginexclusive() results in. There are only a few uses of it outside of server reconfiguration code. That is the price named pays for making it possible to replace an immutable writer residing inside an opaque data structure (struct fstrm_iothr) while multiple producer threads are queueing stuff concurrently. Note that this is not a criticism of fstrm's design, just an explanation of why named is doing what it is doing: task-exclusive mode is what is used for guaranteeing that some worker thread queueing dnstap data (and thus potentially causing the output file to be flushed by fstrm's I/O thread) will not race with another worker thread concurrently reopening the writer (which leads to Bad Things™).

What if you had something like this in the libfstrm API?

The semantics you proposed seem very reasonable to me at this point. Anything allowing safe writer replacement inside an existing fstrm I/O thread should work.

There is one more thing. BIND also allows the dnstap output file to be rolled "at will", using rndc dnstap -reopen. Could the family of functions you proposed perhaps be extended with one more similar function, e.g. fstrm_iothr_options_set_adhoc_writer_replace_callback()? While timed and sized rolls would be triggered by fstrm itself, ad hoc rolls would need to be triggered externally. Yet another function for something like "hey, fstrm, hold my b^Wframes while I replace the writer!" would be needed. Would that be feasible?

@edmonds
Copy link
Contributor

edmonds commented Apr 3, 2018

@kempniu I think BIND should figure out how to replace a pointer being used by multiple threads without needing to block the entire server, e.g. liburcu makes this pretty trivial. If you fixed BIND to not have a giant lock it doesn't appear that there would be a need to complicate the libfstrm API to do all this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants