[networking] plugin can load 7 kernel modules #1435
Comments
Possible approaches to fix:
I tried writing patch for 2nd option (
Running that:
|
Absolutely no way on earth. You cannot ever do this correctly and safely, since the kernel module infrastructure is shared system wide and has no locking mechanism for us to exclude other users while we do our work. You cannot ever know that between your 1st list and 2nd list that the admin did not change the set of loaded modules. That means you may randomly try to unload modules that the admin just deliberately loaded. This has to be handled by detecting when something would be changed, and not doing that thing. |
Right, I hadn't realized that. So we have to:
|
Yeah, I think that's the way. We already have I was talking on IRC about how this might be a good use of the "predicates" idea. We can of course do it with branches, but I envisaged something like this:

```python
ss_pred = SoSPredicate(kmods=["tcp_diag", "udp_diag", "inet_diag", "unix_diag", "netlink_diag"])
self.add_cmd_output("ss -peaonmi", pred=ss_pred)
```

Then the command is collected only if And yes - those |
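The predicate idea can be illustrated with a minimal sketch. This is not the sos implementation: `KmodPredicate`, `loaded_kmods` and the `proc_modules` parameter are hypothetical names chosen for illustration. The sketch assumes the Linux `/proc/modules` format, in which the first whitespace-separated field of each line is the module name. The predicate evaluates to true only when every required module is already loaded, so collecting the guarded command should not pull new modules into the kernel.

```python
from pathlib import Path


def loaded_kmods(proc_modules="/proc/modules"):
    """Return the set of currently loaded kernel module names.

    The file is re-read on every call, so modules loaded after
    start-up are still seen (hypothetical helper, not sos API).
    """
    path = Path(proc_modules)
    if not path.exists():
        return set()
    names = set()
    for line in path.read_text().splitlines():
        fields = line.split()
        if fields:
            # First field of each /proc/modules line is the module name.
            names.add(fields[0])
    return names


class KmodPredicate:
    """Hypothetical predicate: true only if all required modules are loaded."""

    def __init__(self, kmods, proc_modules="/proc/modules"):
        self.kmods = list(kmods)
        self.proc_modules = proc_modules

    def __bool__(self):
        # Subset test against the live module list at evaluation time.
        return set(self.kmods) <= loaded_kmods(self.proc_modules)
```

A plugin could then guard a command with something like `if KmodPredicate(["tcp_diag", "udp_diag"]): ...`. Evaluating lazily in `__bool__` means the check runs at collection time rather than when the predicate is constructed.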
I thought we could also make it more fine-grained, but it looks like just running If we could split it on options we could do something like:

```python
ss_opts = SoSOptionArgs(kmods={"tcp_diag": "-t", "udp_diag": "-u"})
self.get_cmd_output("ss -peaonmi", opt_args=ss_opts)
```

But that's not going to work with the way |
I like the idea of I am in favour of:
|
We already collect and store that in
Has the drawback that we will miss modules that were loaded after the list was initialised. That's the reason the method does not currently use it and always looks at the live
Whatever we find need for: it's a general boolean object that evaluates You are right that if this was just about module options then it would be overkill, but there are other benefits to using a predicate that are not so obvious, but are potentially very powerful. Right now there is no way to answer the "what would sos do?" question other than to run it on a specific system and find out. With predicates we can force these globally, for one, some or all predicate options, and see exactly what sos would do in those circumstances. I see this as useful for testing, bug reproduction, and auditing, as well as finally implementing a long-requested |
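The "what would sos do?" use of predicates can be sketched as follows. Everything here is an assumption made for illustration: `ForceablePredicate`, its class-level `forced` switch and the `plan` helper are hypothetical and do not describe an existing sos interface. The point is only that a predicate whose result can be forced globally lets a caller list the commands that would be collected under given circumstances without running any of them.

```python
class ForceablePredicate:
    """Hypothetical predicate whose result can be forced globally."""

    # None: evaluate the real check. True or False: force that result everywhere.
    forced = None

    def __init__(self, check):
        # check is any zero-argument callable returning a truthy or falsy value.
        self.check = check

    def __bool__(self):
        if ForceablePredicate.forced is not None:
            return ForceablePredicate.forced
        return bool(self.check())


def plan(commands):
    """Return the commands that would be collected, without running anything.

    commands is an iterable of (command string, predicate) pairs.
    """
    return [cmd for cmd, pred in commands if pred]
```

Setting `ForceablePredicate.forced = True` before calling `plan(...)` would show the full set of guarded commands, and `False` the set that survives when none of the conditions hold.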
Current upstream kernels will only load a protocol diag module if the underlying protocol module is loaded. As basic protocols like TCP, UDP, INET, netlink, unix, etc., are typically compiled in, their diag modules are almost certainly always going to be loaded when 'ss' is invoked. But for example, you won't find sctp_diag loaded any more unless 'sctp' was already loaded. I appreciate the concern for leaving the system in the same state after running sosreport, but with the proposed changes we'll almost never get this valuable 'ss' output unless someone happens to have run ss on the system at some time in the past. Also, the ipvs plugin runs ipvsadm commands, which will trigger the loading of the ip_vs module if it is not already there. That could be fixed in the same way as the ip macsec case. |
This isn't our thing. It's a requirement from users (customers), and generally involves claims about auditing etc. We have to be able to leave the system in the same state that we found it - even if that means losing this data. If it is so vital, then support associates should be explaining the need - I'd be fine adding an option to override this on request, but a default run for something unrelated should not load these modules. |
Sorry, I wasn't arguing to ignore it; just lamenting the loss. If the modules are not seen to be loaded when the plugin runs, it theoretically would be safe to unload them after the fact as the kernel will auto-load them whenever an associated netlink request comes through from userspace. There is nothing stateful about the diag modules so even in the rare case some user (a regular user can run ss and load them) used them while the sosreport network plugin was running it should not matter if the plugin unloads them a moment later. But, as much as I'd like to have the 'ss' output in all cases, this solution seems ugly to me and I'm not going to advocate for it.
An option like this would definitely be appreciated. |
Yeah, it's an unfortunate tension that we're always forced to find a balance for. Sometimes these things go away naturally, as more modules become built-in, but I think we'll always have corner cases to handle.
I don't much like these solutions - it's inherently racy, and a bit "dodgy", even if the only side effect is ping-ponging modules in and out of the kernel (or trying to).
OK. That's definitely something we can do. In the meantime, I think it would be worth running this past our support colleagues again for an opinion. If there is a consensus that it's OK to have a way to prevent these commands running if not desired, but to still have them on by default, then I'd be OK with that and it's a bit closer to the ideal you'd like. |
Do the inverse. Run the audit-safe sos on request.
Yes, this. The majority of users are not asking for this feature. The majority of users do not audit their systems to the point they worry about which (safe, signed) kernel modules are loaded before and after system-provided troubleshooting commands are run. However, here's a likely usage pattern:
The majority of users will definitely be frustrated with being asked to provide "the same thing" twice. Write to the majority usecase. Detailed audit users can implement an organisation policy to "always run |
We had one suggestion that this could also be implemented as a |
Is literally what I suggested in the following quoted paragraph :-D To be completely clear, what I am saying is: I am absolutely fine making this optional/non-default. No further discussion needed. I am also open to the idea of making it default/option-to-disable iff there is a consensus among our users (and especially support associates at the big commercial distros), and iff it passes all necessary reviews (including legal).
Once again, we are good, but we are not psychic. We find these things out by asking, or when people come to us and tell us. We also generally work to the principle that support data collection should not modify the system. There is a long, long catalog of bugs (some with very angry users behind them) complaining about this type of behaviour. We have historically optimised for their wishes because they have been the vocal group, and it fits with our existing aims. If that's wrong, then we can work to change it, but we don't do so on a whim. This also isn't as simple and clear-cut as just changing the default and walking away - we have for a very long time communicated (in documentation and our disclaimer text) that we will not modify system configuration. Making this a default and accepting that we will is a break from that. Perhaps it is as simple as inserting "persistent" into that phrase, or some similar tweak; we would need to go back to the legal team to understand the significance.
As an aside, I don't like this option name or what it implies. I think for the time being (until/unless we make some broader change in our disclaimer and what we communicate) that any option like this should be specific and targeted to one instance or class of data collection. |
We always try to map every
Also note that from 3.6 onwards you have access to the preset facility - this means that developers, distribution maintainers, and users can define named combinations of sos command line options to be recalled for later use. This is more flexible than the configuration file, since multiple presets can co-exist (and be combined with one another, or with manually entered command line options), and also allows for distributors to use their product specific knowledge to come up with combinations that are useful for particular products or scenarios (for e.g. the |
Presets may also be subject to configuration management by dropping JSON formatted files into
|
Cool, all is well :)
Nor do I, it was just an example placeholder. |
Please correct me if my understanding isn't right:
Just a question to @superjamie , @ptalbert , @battlemidget or any other kernel/networking domain expert: We work here with an assumption that if |
Running some commands can load kernel modules what disqualifies them from being called by default. Add an option that overrides this default behaviour. Related to: sosreport#1435 Signed-off-by: Pavel Moravec <[email protected]>
…oaded - "ip -s macsec show" requires "macsec" kmod loaded - "ss -peaonmi" requires 6 *_diag kernel modules Execute the commands only when the modules are loaded, or when explicitly requested via --allow-kmod-load option. Resolves: sosreport#1435 Signed-off-by: Pavel Moravec <[email protected]>
I decided to add explicit warning to the two commands if
to warn user we intentionally skip some commands (and to potentially let them re-run |
I would prefer if I haven't looked into situations where the |
As I keep explaining, this runs completely counter to what we have told our users for many years, as well as being a contradiction of our current disclaimer text. If you want to take on board changing that, I'd have no objection - but a side comment in a general PR is not the place for it. Start with an issue, then explore the legal issues, draft a new disclaimer, get it signed off etc. - the technical matters are not the big issue here. In the mean time I suggest your group communicate your needs to your support technicians and users more clearly up-front so that you can avoid receiving useless reports. To re-iterate, I have no problem reconsidering this long standing behaviour if that is truly desired, but this is not something we're going to change on a whim and without the proper ground work. |
The warning:
makes the Travis tests fail. Since Lines 455 to 466 in d0dcb39
Lines 29 to 30 in 6581488
@bmr-cymru and @TurboTurtle, wouldn't it be worth logging warnings to stdout, and errors and more severe messages to stderr? |
I'd agree - |
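The split proposed in this exchange (warnings to stdout, errors and above to stderr) can be sketched with the standard `logging` module. This is an illustration only, not the sos logging setup: the `BelowError` filter, the `make_logger` helper and the logger name are hypothetical.

```python
import logging
import sys


class BelowError(logging.Filter):
    """Let through only records less severe than ERROR."""

    def filter(self, record):
        return record.levelno < logging.ERROR


def make_logger(name="ui_log_sketch", out=None, err=None):
    """Build a logger sending WARNING and below to out, ERROR and above to err."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.handlers.clear()   # avoid duplicate handlers on repeated calls
    logger.propagate = False  # keep records away from the root logger

    # stdout handler: everything below ERROR (info, warning).
    out_handler = logging.StreamHandler(out if out is not None else sys.stdout)
    out_handler.addFilter(BelowError())

    # stderr handler: ERROR and more severe.
    err_handler = logging.StreamHandler(err if err is not None else sys.stderr)
    err_handler.setLevel(logging.ERROR)

    logger.addHandler(out_handler)
    logger.addHandler(err_handler)
    return logger
```

With this arrangement a test harness that treats any stderr output as a failure would not trip on a warning about skipped commands.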
Even though I personally don't like the idea of changing the longstanding behavior of not changing the system at all, ever - I think we should perhaps change |
+1 for the option (re)name, will respin the PR now. About users' preferences for sosreport's default behaviour: a change must be driven or requested by the community of users (to outweigh the current undertaking "sosreport does not alter systems it runs on"). I raised this topic internally within Red Hat support, and the majority of responses agree with following this paradigm. This is not the voice of the whole sosreport community, for sure. But it is the only feedback we have. |
Running some commands can change the system e.g. by loading a kernel modules. That disqualifies the commands from being called by default. Add an option that overrides this default behaviour. Related to: sosreport#1435 Signed-off-by: Pavel Moravec <[email protected]>
…oaded - "ip -s macsec show" requires "macsec" kmod loaded - "ss -peaonmi" requires 6 *_diag kernel modules Execute the commands only when the modules are loaded, or when explicitly requested via --allow-system-changes option. Resolves: sosreport#1435 Signed-off-by: Pavel Moravec <[email protected]>
Running some commands can change the system e.g. by loading a kernel modules. That disqualifies the commands from being called by default. Add an option that overrides this default behaviour. Related to: #1435 Signed-off-by: Pavel Moravec <[email protected]>
In sosreport#1435, --allow-system-changes option was added that is documented in sosreport --help but not in manpages. Resolves: sosreport#1850 Signed-off-by: Pavel Moravec <[email protected]>
In #1435, --allow-system-changes option was added that is documented in sosreport --help but not in manpages. Resolves: #1850 Signed-off-by: Pavel Moravec <[email protected]> Signed-off-by: Bryan Quigley <[email protected]>
See reproducer:
returns:
`macsec` is loaded due to collecting `ip -s macsec show` command output, the remaining 6 ones due to `ss -peaonmi`. While `ip -s macsec show` can be easily put under a condition "if `macsec` module is loaded", `ss` is a bit more tricky, since running naked `ss` can load up to 4 modules (`[tcp|udp|inet|unix]_diag`).
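One way to observe the effect described here is to compare the module list before and after running a command. The sketch below is not the reporter's reproducer: `module_names`, `newly_loaded` and `modules_loaded_by` are hypothetical helpers, and the code assumes the Linux `/proc/modules` format with the module name as the first field of each line. As noted elsewhere in this thread, such a comparison is inherently racy, so it is suitable for diagnosis only and not as a basis for unloading anything.

```python
import subprocess
from pathlib import Path


def module_names(proc_modules_text):
    """Extract module names from text in /proc/modules format."""
    names = set()
    for line in proc_modules_text.splitlines():
        fields = line.split()
        if fields:
            names.add(fields[0])
    return names


def newly_loaded(before_text, after_text):
    """Return, sorted, the modules present after but not before."""
    return sorted(module_names(after_text) - module_names(before_text))


def modules_loaded_by(argv, proc_modules="/proc/modules"):
    """Run argv and report modules that appeared while it ran.

    Racy by nature: another user may load modules at the same time,
    so the result is only a hint about what the command triggered.
    """
    before = Path(proc_modules).read_text()
    subprocess.run(argv, check=False,
                   stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    after = Path(proc_modules).read_text()
    return newly_loaded(before, after)
```

For example, `modules_loaded_by(["ss", "-peaonmi"])` on a Linux host would list any modules that appeared during that one invocation.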