
New journald source #327

Closed
binarylogic opened this issue May 7, 2019 · 16 comments
Assignees
Labels
needs: requirements Needs a list of requirements before work can begin. type: feature A value-adding code addition that introduces new functionality.

Comments

@binarylogic
Contributor

binarylogic commented May 7, 2019

This is a new source that makes it easy to ingest journald entries. The Rust systemd library should make this easier.

Specification

The first order of business for this source is to spec it out. I would like to start with an investigative process, filling in the blanks in my comment below. The questions I have so far (I'm sure I'm missing some):

  1. How do we filter by service? Does the JournalRecord type include the service source? And would this be a post-ingestion filter?
  2. What else can we filter by?
  3. How does checkpointing / cursor position work? Ex: If we restart Vector how do we resume where we left off?

Prior art

@binarylogic binarylogic changed the title Journald source New journald source Jun 20, 2019
@binarylogic binarylogic added the needs: approval Needs review & approval before work can begin. label Jul 12, 2019
@binarylogic
Contributor Author

binarylogic commented Jul 19, 2019

@bruceg I'd like for you to start by finishing off the following spec. Please make any adjustments you feel are necessary. There are a number of behavior questions we have (as noted in the original issue) that I would like to solidify before we begin work:


Specification

Config Example

[sources.journald]
  type = "journald"
  current_runtime_only = true # default
  local_only = true # default
  include = { unit = "nginx.service" }
  exclude = { message = ".*ignore this.*", priority = "debug" }
  units = [ "apache2", "system.slice" ]

Requirements

  • current_runtime_only - boolean - if true, include only journal entries from the current boot. If false, include all entries.
  • local_only - boolean - if true, include only journal entries originating from localhost. If false, include all entries.
  • include - a whitelist table of filters/matches to include. If empty or not present, all entries are accepted.
  • exclude - a blacklist table of filters/matches to drop. Defaults to none. Note: this applies after include and so takes precedence.
  • units - an array of unit names to monitor. If empty or not present, all units are accepted. Unit names lacking a . will have .service appended to make them a valid service unit name.
  • Vector will checkpoint the record most recently processed and continue from that point on restart, to avoid duplicating logs.

@binarylogic binarylogic added needs: requirements Needs a list of requirements before work can begin and removed needs: approval Needs review & approval before work can begin. labels Jul 19, 2019
@bruceg
Member

bruceg commented Jul 20, 2019

Should the include & exclude keys be arbitrary journald fields (like MESSAGE or _COMM etc) or a fixed set of allowed possibilities (like message or command), or some combination where pre-programmed keys have fixed behavior but all others reference literal journald fields?

I think some will have to be fixed, like priority where it's a number and you'd want to compare greater than for include and less than for exclude, but it would be useful to be able to search or filter on any key.
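For the fixed priority case, the numeric comparison could look roughly like this. This is a minimal sketch, not Vector code: the level names and numbers are the standard syslog levels journald uses for its PRIORITY field (0 = emerg through 7 = debug, so "more severe" means a lower number), and the function names here are made up for illustration.

```rust
/// Parse a priority given either as a syslog level name or as a
/// numeric string; journald's PRIORITY uses 0 (emerg) .. 7 (debug).
fn parse_priority(value: &str) -> Option<u8> {
    match value {
        "emerg" => Some(0),
        "alert" => Some(1),
        "crit" => Some(2),
        "err" => Some(3),
        "warning" => Some(4),
        "notice" => Some(5),
        "info" => Some(6),
        "debug" => Some(7),
        other => other.parse().ok().filter(|p| *p <= 7),
    }
}

/// A record passes an include threshold when it is at least as severe,
/// i.e. its numeric priority is less than or equal to the threshold.
fn at_least_as_severe(record: u8, threshold: u8) -> bool {
    record <= threshold
}
```

With this numbering, `include = { priority = "warning" }` would keep emerg through warning and drop notice, info, and debug.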

@bruceg
Member

bruceg commented Jul 20, 2019

How do we foresee handling the privilege boundary issues? The files for journald are not directly readable by non-privileged users (on my system, readable by root or groups systemd-journal, adm, or wheel). The easiest option would be to just require that vector is run with the appropriate supplementary group where journald support is required (which obviously needs to be well documented), but another option would be a separate privileged process to feed in the logs.

@bruceg
Member

bruceg commented Jul 20, 2019

1. How do we filter by service? Does the `JournalRecord` type include the service source? And would this be a post-ingestion filter?

journald records include a large number of fields to filter on. For selecting a service, there is SYSLOG_FACILITY, SYSLOG_IDENTIFIER, _COMM (executable path), _CMDLINE (command and arguments), _SYSTEMD_UNIT and _SYSTEMD_USER_UNIT (for things launched by systemd), _KERNEL_SUBSYSTEM (for kernel log messages), and occasionally UNIT (though this isn't a "trusted" field).

2. What else can we filter by?

See `man systemd.journal-fields`.

3. How does checkpointing / cursor position work? Ex: If we restart Vector how do we resume where we left off?

The systemd crate referenced above has 3 methods of producing a checkpoint value and later seeking back to that checkpoint. This would require storing that checkpoint in a file or table somewhere, and being careful not to allow a race to drop records.
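Stored as a flat file, that checkpoint persistence could look roughly like the sketch below. This is a std-only illustration, not the systemd crate's API: the file names are made up, and the write-temp-then-rename pattern is one way to avoid the partial-write race mentioned above, since a rename within the same filesystem is atomic and a crash mid-write leaves either the old checkpoint or the new one, never a truncated mix.

```rust
use std::fs;
use std::io::Write;
use std::path::Path;

/// Persist the journal cursor string atomically: write to a temporary
/// file, flush it to disk, then rename it over the real checkpoint file.
fn save_checkpoint(dir: &Path, cursor: &str) -> std::io::Result<()> {
    let tmp = dir.join("checkpoint.new");
    let dst = dir.join("checkpoint");
    let mut f = fs::File::create(&tmp)?;
    f.write_all(cursor.as_bytes())?;
    f.sync_all()?; // make sure the bytes hit disk before the rename
    fs::rename(tmp, dst)
}

/// Read back the last saved cursor, if any; on first run there is no
/// checkpoint and the source would start from the beginning (or tail).
fn load_checkpoint(dir: &Path) -> Option<String> {
    fs::read_to_string(dir.join("checkpoint")).ok()
}
```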

@binarylogic
Contributor Author

Thanks @bruceg! So that I don't create misdirection, I'd like to use planning tomorrow morning to obtain consensus on a direction. I'll follow up with answers tomorrow.

@binarylogic
Contributor Author

We spent some time discussing this during planning, so I'll answer your questions in order:

Should the include & exclude keys be arbitrary journald fields (like MESSAGE or _COMM etc) or a fixed set of allowed possibilities (like message or command), or some combination where pre-programmed keys have fixed behavior but all others reference literal journald fields?

To start, we'd like to only include service filters. You are welcome to rename this filter if you find it to be clearer (ex: include_services, exclude_services). The filters for this source should be obvious, fundamental filters that relate directly to the source behavior. Otherwise, you get into questions like "would a filter be better off in a separate filter transform?" Does that make sense?

How do we foresee handling the privilege boundary issues?

I think we go with what you suggested for now, which is "The easiest option would be to just require that vector is run with the appropriate supplementary group where journald support is required (which obviously needs to be well documented)".

The systemd crate referenced above has 3 methods of producing a checkpoint value and later seeking back to that checkpoint. This would require storing that checkpoint in a file or table somewhere, and being careful not to allow a race to drop records.

Currently, the file source stores checkpoint values in individual files in the data_dir, but @lukesteensen recommended using leveldb as a way to store this data. We're on the fence with the best approach here and would like to delegate that decision to you -- whatever you think is the easiest. I would recommend designing this in such a way that we could swap out the underlying persistence strategy in the future.
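One way to keep the persistence strategy swappable is to hide it behind a small trait, so flat files, leveldb, or anything else can satisfy the same interface. This is a hypothetical sketch; none of these names exist in Vector.

```rust
use std::collections::HashMap;

/// A pluggable checkpoint store: the journald source only needs
/// get/set of an opaque cursor string, keyed by source id.
trait CheckpointStore {
    fn get(&self, source_id: &str) -> Option<String>;
    fn set(&mut self, source_id: &str, cursor: &str);
}

/// In-memory implementation, handy for tests; a file- or
/// leveldb-backed store would implement the same trait.
struct MemoryStore {
    entries: HashMap<String, String>,
}

impl MemoryStore {
    fn new() -> Self {
        MemoryStore { entries: HashMap::new() }
    }
}

impl CheckpointStore for MemoryStore {
    fn get(&self, source_id: &str) -> Option<String> {
        self.entries.get(source_id).cloned()
    }
    fn set(&mut self, source_id: &str, cursor: &str) {
        self.entries.insert(source_id.to_string(), cursor.to_string());
    }
}
```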


That should answer most of your questions. This section will outline other details we discussed:

Fields / Schema

We'd like to take all default fields and leave them as unaltered root keys. For example, _SYSTEMD_UNIT would remain exactly as that field:

{
  // ...
  "_SYSTEMD_UNIT": "vector.service",
  "SYSLOG_FACILITY": "debug",
  "SYSLOG_PID": 123,
  // ...
}

The only change we'd like to make is mapping the relevant keys to our default schema: timestamp, message, and host. These should be configurable via timestamp_field, message_field, and host_field configuration options. Any other transformations the user would like to make, such as dropping or renaming fields, can be accomplished with a separate transform component. Does that make sense?
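As a sketch of that mapping: the journald field names below (MESSAGE, _HOSTNAME, __REALTIME_TIMESTAMP) come from systemd.journal-fields, but the function and the choice of schema keys are illustrative, not Vector's actual code, and record values are simplified to strings.

```rust
use std::collections::HashMap;

/// Rename journald's well-known fields onto the default schema keys
/// (message, host, timestamp) and pass every other field through
/// untouched as a root-level key.
fn map_record(record: HashMap<String, String>) -> HashMap<String, String> {
    record
        .into_iter()
        .map(|(k, v)| {
            let key = match k.as_str() {
                "MESSAGE" => "message".to_string(),
                "_HOSTNAME" => "host".to_string(),
                "__REALTIME_TIMESTAMP" => "timestamp".to_string(),
                _ => k, // e.g. _SYSTEMD_UNIT stays exactly as-is
            };
            (key, v)
        })
        .collect()
}
```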

First Version

Keep in mind, we don't need everything in the first version. Perhaps it makes more sense to start with a simple source that does not include any filtering or field mapping, then we can work on follow up changes to add those features.


Let me know if that answers everything, happy to clarify further.

@bruceg
Member

bruceg commented Jul 22, 2019

To start, we'd like to only include service filters. You are welcome to rename this filter if you find it to be clearer (ex: include_services, exclude_services). The filters for this source should be obvious, fundamental filters that relate directly to the source behavior. Otherwise, you get into questions like "would a filter be better off in a separate filter transform?" Does that make sense?

By "service" do you really mean any systemd unit, or specifically only a systemd .service unit? (there are also .device, .mount, .target units, etc).

Currently, the file source stores checkpoint values in individual files in the data_dir, but @lukesteensen recommended using leveldb as a way to store this data. We're on the fence with the best approach here and would like to delegate that decision to you -- whatever you think is the easiest. I would recommend designing this in such a way that we could swap out the underlying persistence strategy in the future.

If we are going to have multiple sources and/or sinks doing checkpointing, it would be a good idea to have a unified storage scheme for them. I have no strong preference between individual files or a database. Files are easier to debug but a database can make storing additional data easier. There are some potential consistency issues (like partial writes) that would best be hidden behind a common interface.

I am curious why LevelDB is being proposed over other key/value stores, particularly LMDB.

The only change we'd like to make is mapping the relevant keys to our default schema: timestamp, message, and host. These should be configurable via timestamp_field, message_field, and host_field configuration options. Any other transformations the user would like to make, such as dropping or renaming fields, can be accomplished with a separate transform component. Does that make sense?

I don't see a point in having those configurable, at least initially. Are there any field names other than MESSAGE used for the message text? The timestamp and host are also trusted fixed fields injected by journald.

Keep in mind, we don't need everything in the first version. Perhaps it makes more sense to start with a simple source that does not include any filtering or field mapping, then we can work on follow up changes to add those features.

Given some of the disagreement, this is probably best. Get the minimum working and then increment as needs arise.

@bruceg
Member

bruceg commented Jul 22, 2019

Oh, I forgot to ask. The initial spec included a "paths" configuration, but the systemd crate doesn't expose this functionality yet. Is it safe to presume I can drop this from the initial spec?

@binarylogic
Contributor Author

By "service" do you really mean any systemd unit, or specifically only a systemd .service unit? (there are also .device, .mount, .target units, etc).

Interesting, I hadn't thought about that. I'm not entirely sure if collecting log data for .device, .mount, .target units is useful. From what I can gather, it doesn't seem to be. I'm leaning towards just .service units, but if you understand the other unit types better, please feel free to use your best judgement.

If we are going to have multiple sources and/or sinks doing checkpointing, it would be a good idea to have a unified storage scheme for them.

👍 from me on that. If you'd like to extract that out, that would be fine. I'm slightly concerned that could expand scope on this PR. What do you think about the initial PR forgoing check-pointing, and then we can add that in a follow up PR? Again, I'll defer to you on the best way to approach this.

I am curious why LevelDB is being proposed over other key/value stores, particularly LMDB.

We currently use leveldb for our on-disk buffering, and I believe it had a number of requirements we needed there, specifically around ordering. We suggested leveldb because we already use it for that and we're trying to reduce the number of dependencies we need.

I don't see a point in having those configurable, at least initially. Are there any field names other than MESSAGE used for the message text? The timestamp and host are also trusted fixed fields injected by journald.

Agree, let's skip that for now.

Oh, I forgot to ask. The initial spec included a "paths" configuration, but the systemd crate doesn't expose this functionality yet. Is it safe to presume I can drop this from the initial spec?

Yep, we can drop the paths configuration for this first version. We might not need it.

@bruceg
Member

bruceg commented Jul 23, 2019

Interesting, I hadn't thought about that. I'm not entirely sure if collecting log data for .device, .mount, .target units is useful. From what I can gather, it doesn't seem to be. I'm leaning towards just .service units, but if you understand the other unit types better, please feel free to use your best judgement.

Then I will mimic systemd's behavior: automatically append .service if the unit name contains no dot. That way, most valid uses will simply be the service name, but all other units could potentially be used as well.
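That fixup is a one-liner; a sketch of the behavior described above (the function name is made up):

```rust
/// If the unit name contains no dot, treat it as a service and append
/// ".service"; otherwise keep the explicit unit type (.slice, .mount, ...).
fn fixup_unit_name(name: &str) -> String {
    if name.contains('.') {
        name.to_string()
    } else {
        format!("{}.service", name)
    }
}
```

So `units = [ "apache2", "system.slice" ]` from the config example would match `apache2.service` and `system.slice`.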

+1 from me on that. If you'd like to extract that out, that would be fine. I'm slightly concerned that could expand scope on this PR. What do you think about the initial PR forgoing check-pointing, and then we can add that in a follow up PR? Again, I'll defer to you on the best way to approach this.

The problem with forgoing check-pointing is that each time the journal is opened, it is re-read from the start. This can potentially be a huge amount of data on long-running machines. I will start without this capability, but it will only be useful for testing so I'll try to work it in for the initial PR.

@binarylogic
Contributor Author

The problem with forgoing check-pointing is that each time the journal is opened, it is re-read from the start.

Yep, understood. We wouldn't advertise the integration until checkpointing is done. I just thought it might be more focused, from a development and review perspective, to break them out.

@bruceg
Member

bruceg commented Jul 30, 2019

Checkpointing is going to depend on issue #644 / pull #673, in particular the global data_dir option.

@binarylogic
Contributor Author

@bruceg nice work on these changes. We can close this, correct?

@bruceg
Member

bruceg commented Sep 13, 2019

I did not close it as two of the spec points are unfinished (the include and exclude filters). There should be a transform that can do this, but there is none that completely fits.

@LucioFranco
Contributor

@bruceg I think we should open an issue for that transform and then close this one.

@bruceg
Member

bruceg commented Sep 17, 2019

Closing this via #882

@bruceg bruceg closed this as completed Sep 17, 2019
@binarylogic binarylogic added type: feature A value-adding code addition that introduces new functionality. and removed type: new feature labels Jun 16, 2020
3 participants