Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reorganize the file source/server defaults for easier internal use #4048

Closed
MOZGIII opened this issue Sep 21, 2020 · 2 comments
Closed

Reorganize the file source/server defaults for easier internal use #4048

MOZGIII opened this issue Sep 21, 2020 · 2 comments
Labels
meta: idea Anything in the idea phase. Needs further discussion and consensus before work can begin. needs: approval Needs review & approval before work can begin. source: file Anything `file` source related source: kubernetes_logs Anything `kubernetes_logs` source related type: task Generic non-code related tasks

Comments

@MOZGIII
Copy link
Contributor

MOZGIII commented Sep 21, 2020

As mentioned at #4043 (comment), it would be great to have a way to opt-out of thinking about some of the configuration parameters for the file_source::FileServer. However, it currently doesn't implement the Default trait.

However, there is a default implementation at the vector::sources::file::FileConfig with a lot of useful defaults.

There are some usability problems we'd like to solve here:

  • one could construct the Default vector::sources::file::FileConfig and take some value from it, but it's suboptimal: in advanced use cases, vector::sources::file::FileConfig parameters don't directly translate to the file_source::FileServer (like when using a custom path_provider);
  • vector::sources::file::FileConfig is scoped to the vector::sources::file, while we might want to use it in other source, i.e. vector::sources::kubernetes_logs;
  • specifying all the parameters inline at each non-standard file_source::FileServer initialization is problematic because then there's no distinction between the parameters that are set to their respective values explicitly or just set to whatever the defaults are presumed to be; as already mentioned, we'd like to have a way to just opt-out of specifying the default values, omit the fields and rely on default-initialization.

To achieve it, the proposed solution is to alter the file_source::FileServer in such a way that it is constructible via some other struct, that has the Default-initializer (with all the "primitve" or trivial values in it), and a set of other parameters that are hard to provide defaults for (generic or otherwise abstracted):

pub struct FileServer<PP, E: FileSourceInternalEvents>
where
    PP: PathsProvider,
{
    pub paths_provider: PP,
    pub emitter: E,
    pub config: Config,
}

pub struct Config {
    pub max_read_bytes: usize,
    pub start_at_beginning: bool,
    pub ignore_before: Option<time::SystemTime>,
    pub max_line_bytes: usize,
    pub data_dir: PathBuf,
    pub glob_minimum_cooldown: Duration,
    pub fingerprinter: Fingerprinter,
    pub oldest_first: bool,
    pub remove_after: Option<Duration>,
}

impl Default for Config {
    // ...
}

This is just a sample code to illustrate the idea, the exact structure is worth proper discussion.

@lukesteensen, do you like this approach?

@MOZGIII MOZGIII added type: feature A value-adding code addition that introduce new functionality. meta: idea Anything in the idea phase. Needs further discussion and consensus before work can begin. source: file Anything `file` source related source: kubernetes_logs Anything `kubernetes_logs` source related type: task Generic non-code related tasks needs: approval Needs review & approval before work can begin. and removed type: feature A value-adding code addition that introduce new functionality. labels Sep 21, 2020
@MOZGIII
Copy link
Contributor Author

MOZGIII commented Sep 21, 2020

Btw, while we at it, we might rename the glob_minimum_cooldown to file_discovery_minimum_cooldown, and clear some comments that seem to be outdated:

https://github.com/timberio/vector/blob/a39cd0bbbd7d89472bedf9044cb425b1465e33d9/lib/file-source/src/file_server.rs#L26-L29

We should also probably add doc-comments to each of the options.

@jszwedko
Copy link
Member

Good ideas here, but it'd be unlikely for us to pick this up without additional motivation.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
meta: idea Anything in the idea phase. Needs further discussion and consensus before work can begin. needs: approval Needs review & approval before work can begin. source: file Anything `file` source related source: kubernetes_logs Anything `kubernetes_logs` source related type: task Generic non-code related tasks
Projects
None yet
Development

No branches or pull requests

2 participants