New journald source #327
Comments
@bruceg I'd like for you to start by finishing off the following spec. Please make any adjustments you feel are necessary. There are a number of behavior questions we have (as noted in the original issue) that I would like to solidify before we begin work:
Specification
Config Example
[sources.journald]
type = "journald"
current_runtime_only = true # default
local_only = true # default
include = { unit = "nginx.service" }
exclude = { message = ".*ignore this.*", priority = "debug" }
units = [ "apache2", "system.slice" ]
Requirements
|
Should the I think some will have to be fixed, like |
How do we foresee handling the privilege boundary issues? The files for journald are not directly readable by non-privileged users (on my system, readable by |
journald records include a large number of fields to filter on. For selecting a service, there is
See
The systemd crate referenced above has 3 methods of producing a checkpoint value and later seeking back to that checkpoint. This would require storing that checkpoint in a file or table somewhere, and being careful not to allow a race to drop records. |
Thanks @bruceg! So I don't create misdirection, I'd like to use planning tomorrow morning to obtain consensus on a direction. I'll follow up with answers tomorrow. |
We spent some time discussing this during planning. So I'll answer your questions in order:
To start, we'd like to only include service filters. You are welcome to rename this filter if you find it to be clearer (ex:
I think we go with what you suggested for now, which is "The easiest option would be to just require that vector is run with the appropriate supplementary group where journald support is required (which obviously needs to be well documented)".
Currently, the
That should answer most of your questions. This section will outline other details we discussed:
Fields / Schema
We'd like to take all default fields and leave them as unaltered root keys. For example,
{
  // ...
  "_SYSTEMD_UNIT": "vector.service",
  "SYSLOG_FACILITY": "debug",
  "SYSLOG_PID": 123,
  // ...
}
The only change we'd like to make is mapping the relevant keys to our default schema:
First Version
Keep in mind, we don't need everything in the first version. Perhaps it makes more sense to start with a simple source that does not include any filtering or field mapping, then we can work on follow up changes to add those features. Let me know if that answers everything, happy to clarify further. |
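A minimal sketch of the Fields / Schema idea above: every journald field stays an unaltered root key, except for a small set that is renamed to a default-schema key. The `SCHEMA_MAP` pairs, the `apply_schema` name, and the use of plain string values are assumptions for illustration only; the thread does not settle which keys are mapped.

```rust
use std::collections::BTreeMap;

/// Hypothetical journald-field -> default-schema-key pairs.
/// These are assumptions for illustration, not the agreed mapping.
const SCHEMA_MAP: &[(&str, &str)] = &[
    ("MESSAGE", "message"),
    ("_HOSTNAME", "host"),
];

/// Rename the mapped keys; leave every other field as an unaltered root key.
fn apply_schema(record: BTreeMap<String, String>) -> BTreeMap<String, String> {
    record
        .into_iter()
        .map(|(key, value)| {
            // Look up a replacement key, if one is configured for this field.
            let target = SCHEMA_MAP
                .iter()
                .find(|(from, _)| *from == key.as_str())
                .map(|(_, to)| to.to_string());
            (target.unwrap_or(key), value)
        })
        .collect()
}
```

With this sketch, a record containing `MESSAGE` and `_SYSTEMD_UNIT` would come out with `message` and an untouched `_SYSTEMD_UNIT`.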
By "service" do you really mean any systemd unit, or specifically only a systemd
If we are going to have multiple sources and/or sinks doing checkpointing, it would be a good idea to have a unified storage scheme for them. I have no strong preference between individual files or a database. Files are easier to debug but a database can make storing additional data easier. There are some potential consistency issues (like partial writes) that would best be hidden behind a common interface. I am curious why LevelDB is being proposed over other key/value stores, particularly LMDB.
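To make the "common interface" point concrete, here is a rough sketch of a file-backed variant, under stated assumptions. The `Checkpointer` trait and `FileCheckpointer` type are hypothetical names, and the checkpoint is treated as an opaque string. Partial writes are avoided by writing to a temporary file and renaming it into place, which is atomic on POSIX filesystems when both paths are on the same filesystem.

```rust
use std::fs;
use std::io::{self, Write};
use std::path::{Path, PathBuf};

/// Hypothetical interface shared by any source or sink that checkpoints.
trait Checkpointer {
    /// Persist the latest position (treated here as an opaque string).
    fn store(&mut self, position: &str) -> io::Result<()>;
    /// Load the last stored position, or `None` if nothing was stored yet.
    fn load(&self) -> io::Result<Option<String>>;
}

/// File-backed implementation: one checkpoint per file.
struct FileCheckpointer {
    path: PathBuf,
}

impl FileCheckpointer {
    fn new(path: impl AsRef<Path>) -> Self {
        Self { path: path.as_ref().to_path_buf() }
    }
}

impl Checkpointer for FileCheckpointer {
    fn store(&mut self, position: &str) -> io::Result<()> {
        // Write the new value next to the real file first.
        let tmp = self.path.with_extension("tmp");
        let mut file = fs::File::create(&tmp)?;
        file.write_all(position.as_bytes())?;
        // Flush contents to disk before the rename makes them visible.
        file.sync_all()?;
        // Readers see either the old checkpoint or the new one, never a mix.
        fs::rename(&tmp, &self.path)
    }

    fn load(&self) -> io::Result<Option<String>> {
        match fs::read_to_string(&self.path) {
            Ok(contents) => Ok(Some(contents)),
            Err(err) if err.kind() == io::ErrorKind::NotFound => Ok(None),
            Err(err) => Err(err),
        }
    }
}
```

A database-backed implementation could sit behind the same trait, which is the main reason to define the interface before choosing a store.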
I don't see a point in having those configurable, at least initially. Are there any field names other than
Given some of the disagreement, this is probably best. Get the minimum working and then increment as needs arise. |
Oh, I forgot to ask. The initial spec included a "paths" configuration, but the |
Interesting, I hadn't thought about that. I'm not entirely sure if collecting log data for
👍 from me on that. If you'd like to extract that out, that would be fine. I'm slightly concerned that could expand scope on this PR. What do you think about the initial PR forgoing check-pointing, and then we can add that in a follow up PR? Again, I'll defer to you on the best way to approach this.
We currently use leveldb for our on-disk buffering, and I believe it met a number of requirements we had there, specifically around ordering. We suggested leveldb just because we already use it for that and we're trying to reduce the number of dependencies we need.
Agree, let's skip that for now.
Yep, we can drop the |
Then I will mimic systemd's behavior: automatically append
The problem with forgoing check-pointing is that each time the journal is opened, it is re-read from the start. This can potentially be a huge amount of data on long-running machines. I will start without this capability, but it will only be useful for testing so I'll try to work it in for the initial PR. |
Yep, understood. We wouldn't advertise the integration until checkpointing is done. I just thought it might be more focused from a development and review perspective to break them out. |
@bruceg nice work on these changes. We can close this, correct? |
I did not close it as two of the spec points are unfinished (the |
@bruceg I think we should open an issue for that transform, then close this one. |
Closing this via #882 |
This is a new source that makes it easy to ingest journald entries. The Rust systemd library should make this easier.
Specification
The first order of business for this source is to spec it out. I would like to start with an investigative process filling in the blanks in my comment below. The questions I have (I'm sure I'm missing some):
JournalRecord type include the service source? And would this be a post ingestion filter?
Prior art