Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[4.0] Initial tree reorganization and new binary implementation #1993

Closed
wants to merge 44 commits into from

Conversation

TurboTurtle
Copy link
Member

@TurboTurtle TurboTurtle commented Mar 27, 2020

This is the initial proposal for the reorganization of the project tree, and the introduction of the new sos binary as the command of choice for running sos.

In no particular order, here are some highlights that deserve attention:

  • This includes [Plugin] Add iterator for block devices, update relevant plugins, and remove add_udev_info  #1508 and [Policy|Plugin] Provide a Container Runtime Abstraction #1873. This is done now to make the merging process easier, and to close out two of the goals highlighted in the 4.0 RFC.

  • Policy initialization is now done after logging setup. I did this because in previous Policy() work I have found it frustrating that logging was not available. This plays into the request for [policies] Detect the RedHat OS variant using 'ID' field #1934.

    • As far as I could tell, the only blocker on doing this was policy-defined tmp_dir, which is needed to setup logging within the temp directory created for sos. I have tentatively set this to be /var/tmp if one is not specified by --tmp-dir by the user. This appears to be a change for Ubuntu, away from /tmp, and so I'd like @battlemidget and @BryanQuigley to weigh in on if that is acceptable or if that poses some major issue for that distro line.
  • Subcommands can be added by subclassing SoSComponent from sos/component.py. Report has been updated to make use of this mechanism. Subcommands must also be explicitly added in sos/__init__.py/SoS() handling to be supported. I wanted to make the tree as clean as possible, and dynamic discovery and loading of subcommands proved to be at odds with that.

  • SoSOptions required a lot of rework. Most of the functionality remains the same however, with two major exceptions.

    • Arguments must now be specified by anything that subclasses SoSComponent in the arg_defaults class attribute. This is a dict of key/value pairings for any options that subcommand implements. This replaces the previous design where arg_defaults was statically set for SoSOptions.
    • Loading from a config file is handled slightly differently now. Rather than merge against cmdline args, the conf file values are updated directly against the "main" SoSOptions instance that gets used and morphed into self.opts for any SoSComponent.
  • The interpreter has been set to python3, in order to kickstart the python3-only development mindset for 4.0. I am not sure if it is still functional if you manually shove it through a python2 interpreter, and I have not tested doing so.

    • I fully expect the Travis py2 tests to fail
    • I have not yet done much de-python2-ing, though there are a sprinkle of those changes in the tree.
  • Do not expect make or setuputils to work at this time. Packaging will be addressed further into the 4.0 development cycle.

  • There are a lot of commits in this PR. I tried to condense where I could and where it made sense, however with how involved this work is I think it's reasonable to say that any developer would assume that individual commits can be breaking until the end of this patchset.


Please place an 'X' inside each '[]' to confirm you adhere to our Contributor Guidelines

  • Is the commit message split over multiple lines and hard-wrapped at 72 characters?
  • Is the subject and message clear and concise?
  • Does the subject start with [plugin_name] if submitting a plugin patch or a [section_name] if part of the core sosreport code?
  • Does the commit contain a Signed-off-by: First Lastname [email protected]?
  • If this commit closes an existing issue, is the line Closes: #ISSUENUMBER included in an independent line?
  • If this commit resolves an existing pull request, is the line Resolves: #PRNUMBER included in an independent line?

@lgtm-com
Copy link

lgtm-com bot commented Mar 27, 2020

This pull request introduces 1 alert when merging 6baace7 into a240961 - view on LGTM.com

new alerts:

  • 1 for Unused import

@BryanQuigley
Copy link
Contributor

/var/tmp vs /tmp
We definitely do not want sosreports surviving a reboot by default - I'd be worried about slowly filling up disks. Why do you want them to?

I was not expecting the commands to move to bin/. I'm not strongly against it, just wondering why?

I somewhat feel we should have just reviewed and committed #1508/#1873 separately first. Why does this make it easier?

@TurboTurtle
Copy link
Member Author

TurboTurtle commented Mar 27, 2020

/var/tmp vs /tmp
We definitely do not want sosreports surviving a reboot by default - I'd be worried about slowly filling up disks. Why do you want them to?

There are a few utilities that will collect an sosreport today, e.g. abrt. For one example, say abrt generates an sosreport and then a system experiences a kernel panic - you have now lost the pre-reboot sosreport.

There's also a not-uncommon pattern of asking for an sosreport before and after some upgrade operation that includes a reboot for a kernel upgrade. Keeping them in place through reboots means less hassle for the end user in terms of number of trips to retrieve the archives.

I was not expecting the commands to move to bin/. I'm not strongly against it, just wondering why?

That's just the location in the tree. Packaging would still place the binaries wherever they need to go. I.E. if you're running from the git checkout, yes you'd need to invoke using ./bin/sos, but when packaged and installed for a particular distribution you'd still just call sos as the packaging should drop them in the proper location in $PATH.

They need to be in different directory in the tree since we obviously can't have an sos binary and directory at the same location.

I somewhat feel we should have just reviewed and committed #1508/#1873 separately first. Why does this make it easier?

These are both relatively big changes to Plugin/Policy, that don't really make sense in either a maintenance release or a mid-release commit. Since they weren't included in 3.9, they pushed to 4.0 (where they were included in the RFC). And by easier I mean, it's better to include the planned Plugin changes before we move the tree around so that there is less rebasing that needs to happen.

@BryanQuigley
Copy link
Contributor

BryanQuigley commented Mar 27, 2020

They need to be in different directory in the tree since we obviously can't have an sos binary and directory at the same location.

That's a good reason :).

There are a few utilities that will collect an sosreport today ... you have now lost the pre-reboot sosreport.
If a utility is managing sosreport generation, it makes sense to me that it would also manage sosreport cleaning - but maybe it should specify a specific app specific location for the sosrepot?

We did have a discussion internally about some customers wanting to have sosreports generated on a regular basis - we would have that tool manage what ones need to be cleaned and when.

Do you have anything that help keep /var/tmp clean?

@TurboTurtle
Copy link
Member Author

Do you have anything that help keep /var/tmp clean?

We don't have anything specific for it to my knowledge.

@pmoravec
Copy link
Contributor

Do you have anything that help keep /var/tmp clean?

We don't have anything specific for it to my knowledge.

Maybe systemd-tmpfiles ?

@lgtm-com
Copy link

lgtm-com bot commented Mar 27, 2020

This pull request introduces 1 alert when merging 726d728 into a240961 - view on LGTM.com

new alerts:

  • 1 for Unused import

@TurboTurtle
Copy link
Member Author

Realized there was a missing fixup when compared to my local branch, so I've repushed that after squashing it into the set.

I'll fixup the LGTM alert when I inevitably make another push after getting feedback on this.

@bmr-cymru
Copy link
Member

We use /var/tmp in the Red Hat policies because of tmp-on-tmpfs. Other policies are free to do as they wish based on their distro's default configuration (abrt-generated reports should end up in their own directory anyway).

@TurboTurtle
Copy link
Member Author

I've updated the actual sosreport runs in simple.sh to use the new binary, and those tests seem to be passing now. Updating setup.py will take care of the rest of the tests, as we just need to update the packages section to match the new layout. Holding out on that change for the moment until this initial review is done and agreed upon.

Copy link
Contributor

@BryanQuigley BryanQuigley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 with the sosreport arg fix

@TurboTurtle
Copy link
Member Author

Re-pushed with this fixup change to have the sosreport redirection not break

         # load values from cmdline
-        cmdopts = SoSOptions().from_args(self.parser.parse_args())
+        cmdopts = SoSOptions().from_args(self.parser.parse_args(self.cmdline))

@pmoravec
Copy link
Contributor

pmoravec commented Apr 4, 2020

Few comments (that I might add more during this weekend):

  • trivial subtle bug: calling bin/sos without an argument / component raises:
# PYTHONPATH=. /usr/bin/python3 ./bin/sos
Unknown subcommand 'None' specified
Could not initialize 'None': None
#
  • bump version to 4.0 should be deferred to a commit just before the release (though this will be surely the goal of the release)
  • Testing on RHEL8, I need to set PYTHONPATH explicitly to . to let the sos[report] binary to import SoS from sos:
# /usr/bin/python3 ./bin/sos report
Traceback (most recent call last):
  File "./bin/sos", line 15, in <module>
    from sos import SoS
ImportError: cannot import name 'SoS'
# /usr/bin/python3 ./bin/sosreport
Traceback (most recent call last):
  File "./bin/sosreport", line 20, in <module>
    from sos import SoS
ImportError: cannot import name 'SoS'
#

So I have to call PYTHONPATH=. /usr/bin/python3 ./bin/sos report - this sounds strange to me esp. when I see the sys.path.append(os.getcwd()) in https://github.com/sosreport/sos/pull/1993/files#diff-47225f38aba72b212f4a92e1bfcc8021R14-R15


for container in containers:
self.add_cmd_output("podman inspect %s" % container,
subdir='containers')

for img in images:
name, img_id = img.strip().split()
name, img_id = img
insp = name if 'none' not in name else img_id
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment is not related to the change itself, but either way: what if a container name is e.g. danone, i.e. containing none string? I think it does not matter much, as img_id is always set (am I right here?). So the only impact would be less human-readable file podman_inspect_<id> will be stored instead of podman_inspect_<name>.

Copy link
Member Author

@TurboTurtle TurboTurtle Apr 4, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are situations where podman (and docker for that matter) will report an image name as a string of just 'none' or "<none>", so we could potentially make this a more exact match if needed. However, yes the image ID is always present, which is why we fallback like we do here.

"nvme smart-log /dev/%s" % dev,
"nvme error-log /dev/%s" % dev,
"nvme show-regs /dev/%s" % dev,
"nvme get-ns-id /dev/%s" % dev
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This command is missing in cmds. Is that intentional?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep - we talked about that in #1508 too. It's duplicate data.

Copy link
Contributor

@pmoravec pmoravec left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think docs/plugins.rst and setup.py shall replace sos.plugins to sos.report.plugins as well, but definitely not required now (likewise Makefile changes). Just a note for further reference/check.

@TurboTurtle
Copy link
Member Author

Few comments (that I might add more during this weekend):

  • trivial subtle bug: calling bin/sos without an argument / component raises:
# PYTHONPATH=. /usr/bin/python3 ./bin/sos
Unknown subcommand 'None' specified
Could not initialize 'None': None
#

Hmm, I would expect argparse to catch that and print the help message. Will dig into that to see if there's a way to force that.

  • bump version to 4.0 should be deferred to a commit just before the release (though this will be surely the goal of the release)

Fair point, I'll drop that commit.

  • Testing on RHEL8, I need to set PYTHONPATH explicitly to . to let the sos[report] binary to import SoS from sos:
# /usr/bin/python3 ./bin/sos report
Traceback (most recent call last):
  File "./bin/sos", line 15, in <module>
    from sos import SoS
ImportError: cannot import name 'SoS'
# /usr/bin/python3 ./bin/sosreport
Traceback (most recent call last):
  File "./bin/sosreport", line 20, in <module>
    from sos import SoS
ImportError: cannot import name 'SoS'
#

So I have to call PYTHONPATH=. /usr/bin/python3 ./bin/sos report - this sounds strange to me esp. when I see the sys.path.append(os.getcwd()) in https://github.com/sosreport/sos/pull/1993/files#diff-47225f38aba72b212f4a92e1bfcc8021R14-R15

That is strange - it appears that python 3.6.8 on RHEL 8 is not attempting the last location in sys.path (our appended cwd). Because if I change this to sys.path.insert(0, os.getcwd()) it works as expected.

I'll raise a RHBZ for this to ask what the maintainers team for the python interpreter suggest.

Shouldn't be an issue, I'll swing back around to that one.

@TurboTurtle
Copy link
Member Author

TurboTurtle commented Apr 4, 2020

The more I think about the sys.path issue, the more I think using append() instead of insert() is wrong, and that we should just be doing an insert(0, os.getcwd()) anyways in order to force the use of the cwd-relative modules over a potential system installation when running from the checkout.

EDIT: Oh, derp. This is actually obvious what's going on. You have a locally installed (older) version of sos, that is superseding the checkout path for the module imports. In other words, yes we need to change the append to an insert for all purposes.

@pmoravec
Copy link
Contributor

pmoravec commented Apr 5, 2020

Parsing from /etc/sos.conf stores int values as str. In particular, having /etc/sos.conf:

[general]
log-size = 15

then most or all plugins raise an uncaught exception:

# /usr/bin/python3 ./bin/sos report -o hardware --batch -vvv
..
[sos.report:setup] executing 'sos report -o hardware --batch -vvv'
[sos.report:setup] using 'rhel' preset defaults ()
[sos.report:setup] effective options now: --batch --log-size 15 --onlyplugins hardware -vvv
 Setting up plugins ...
caught exception in plugin method "hardware.setup()"
caught exception in plugin method "hardware.setup()"
writing traceback to sos_logs/hardware-plugin-errors.txt
writing traceback to sos_logs/hardware-plugin-errors.txt

The traceback:

Traceback (most recent call last):
  File "/root/sos-4.0/sos/sos/report/__init__.py", line 873, in setup
    plug.setup()
  File "/root/sos-4.0/sos/sos/report/plugins/hardware.py", line 28, in setup
    "/sys/class/drm/*/edid"
  File "/root/sos-4.0/sos/sos/report/plugins/__init__.py", line 1117, in add_copy_spec
    if sizelimit and current_size > sizelimit:
TypeError: '>' not supported between instances of 'int' and 'str'

Similar problem happens for threads = 3 in sos.conf.

The cause is on line https://github.com/sosreport/sos/pull/1993/files#diff-7e7bbf33e24f9243b70a0371399d2fe4R166 where setattr should be replaced by a kind of self._merge_opt (but I failed to figure out details). Simply, the setattr overrides (int)4 to (str)'3' for threads = 3 in sos.conf.

@pmoravec
Copy link
Contributor

pmoravec commented Apr 5, 2020

First, thanks for such prompt answers and fixes (and for the PR in general, of course! a tremendous amount of work).

  • speaking about /etc/sos.conf I think we should "modularize" it as well (not now, e.g. once merging another module). Nowadays it sets tunables for sos report only. Maybe to have /etc/sos.conf for sos generic options (the few common ones), and current file to move to /etc/sos.modules.d/report.conf?
  • ./bin/sos rep stopped working (I bet it worked yesterday) - see my comment in code "We need to check against aliases")
  • should not sos.policies be moved to sos.report.policies (and sos/policies dir under sos/report/policies)? Or will be Policy derived objects applicable outside sos report?

(no further review points from me, yay!)

@TurboTurtle
Copy link
Member Author

@pmoravec nice catch! I thought I had caught the type casting, but apparently was overlooking it for the config file. I'm pushing a branch update that does explicit checks against the default type as defined by the component loading options into SoSOptions. I too tried to use _merge_opts(), but wasn't able to get it to bend in both directions at once for our needs.

./bin/sos rep stopped working (I bet it worked yesterday)

Interesting. The way I understand aliases to work is that parse_args() should translate any of the aliases back to the "formal" name, I.E. when we check the component option after parse_args(), it should be set to report even when the user specifies rep. That doesn't seem to be happening now, and I'm not sure why.

should not sos.policies be moved to sos.report.policies (and sos/policies dir under sos/report/policies)? Or will be Policy derived objects applicable outside sos report?

My current view is that sos-collector integration will use policies as well.

@TurboTurtle
Copy link
Member Author

Ok, I've done a bit more digging. And this exact topic is a matter of (recurring) debate in upstream python - the prevailing recommendation right now though is to use set_defaults() on the subparsers specifying the aliases to point back to the "normal" name/value. I've update the branch with that, and it seems to be working as desired.

$ sudo ./bin/sos rep

sosreport (version 3.9)

This command will collect system configuration and diagnostic
information from this Fedora system.

[...]
$ sudo ./bin/sos report

sosreport (version 3.9)

This command will collect system configuration and diagnostic
information from this Fedora system.

@TurboTurtle
Copy link
Member Author

As an aside, what are the thoughts on the term component, in both the SoSComponent aspect and the --help messages? Any objections? Better suggestions? We do already have a SoSCommand class, but that can easily be renamed if desired - it's only used within Plugin as it was the replacement for the tuple previously used to store the command being run with kwargs from _add_cmd_output().

This is the first patch to add a new `sos` binary to act as the new
commandline interface for users.

It adds the basic `SoS()` and `SoSComponent()` classes that will be
built out and used going forward for new subcommand development. Actual
functionality will be added in coming patches, as this is mainly serving
as a checkpoint to avoid over-large commits affecting wide swathes of
changes at once.

Related: sosreport#1986

Signed-off-by: Jake Hunsaker <[email protected]>
Moves `SoSOptions()` and the related bits to a new sos/options.py
location. This should allow for easier maintenance and ongoing
development as our option handling approach will be shared across
components.

Signed-off-by: Jake Hunsaker <[email protected]>
Signed-off-by: Jake Hunsaker <[email protected]>
Now dynamically build the usage string reported by --help for the
top-level parser (I.E. `sos --help`) to print available components.

Additionally, rework the subparser usage strings to accurately report
the syntax for subcommands.

Signed-off-by: Jake Hunsaker <[email protected]>
This is the first of probably a number of commits aimed at making
SoSOptions() a generic option handling class for all components in sos,
rather than being tightly bound to `report`.

Signed-off-by: Jake Hunsaker <[email protected]>
Moves logging setup into `SoSComponent`, and logging is setup by default
at component initialization, unless the `configure_logging` class attr
is set to `False`.

Moves the TempFileUtil() class out of report and into utilities for
easier usage by all sos components.

Signed-off-by: Jake Hunsaker <[email protected]>
As we are moving to decouple SoSOptions() from report, we need to adjust
how some of the option loading is handled, such as loading values from a
conf file. The previous approach was to instantiate a new SoSOptions()
object, read into it from the config file, and then merge that with the
"main" SoSOptions() object. This worked well when we only had report to
worry about since the arguments were more or less static.

However, in allowing new subcommands to define their own options this
presented a problem due to the behavior of merging. This was compounded
by the fact that merging still functioned how we expected/needed for
actions like merging commandline options against the defaults.

As such, the new approach is to use the "main" SoSOptions() instance to
read the config file specifications and then directly modify the needed
values rather than relying on the merging behavior.

Signed-off-by: Jake Hunsaker <[email protected]>
Adds the exit handling from SoSReport() directly into SoSComponent()

Signed-off-by: Jake Hunsaker <[email protected]>
Adds Policy loading into `SoSComponent()` so that any subcommand can, if
it needs it, have policies loaded consistenly. This is controlled via
the `load_policy` class attr, which defaults to True. Future subcommands
that do not need to have a policy loaded (E.G. the planned `info`
command), can simply set this to False in order to skip this step.

Moves `SoSComponent()` out of sos/__init__.py and into sos/component.py
to ease import conflicts arising from policy loading.

Signed-off-by: Jake Hunsaker <[email protected]>
Moving the `_sos_ definition wrapper for gettext into utilities proved
to be the wrong direction, so this moves it back into sos/__init__.py in
order to preserve the ability to properly import it elsewhere.

Signed-off-by: Jake Hunsaker <[email protected]>
For the 3.x line, many classes subclassed `object` directly, in order to
use the 'new-style' objects introduced with python3, while still
maintaining python2 functionality. As sos-4.0 is python3 only, we no
longer need to do this.

Signed-off-by: Jake Hunsaker <[email protected]>
Signed-off-by: Jake Hunsaker <[email protected]>
Moves SoSReport() from a standalone class to a subclass of
SoSComponent() in order to take advantage of the new 4.0 redesign.

This commit also pulls out methods that were previously transitioned to
other places, e.g. logging setup moving from SoSReport() to the base
SoSComponent() class.

Signed-off-by: Jake Hunsaker <[email protected]>
For python3, calling `.keys()` on a dict no longer returns a list, so we
need to explicitly make it a list of values to behave as expected.

Signed-off-by: Jake Hunsaker <[email protected]>
Removes an extraneous setting of self.sys_tmp, which is now handled via
`SoSComponent`. This was getting overridden to `None` which was causing
the final archive creation to fail.

Signed-off-by: Jake Hunsaker <[email protected]>
When loading from a config file, ensure that verbosity is set to and
integer rather than a string.

Signed-off-by: Jake Hunsaker <[email protected]>
Adds the `desc` class attr to SoSReport() for use by the help text when
`sos --help` is run.

Signed-off-by: Jake Hunsaker <[email protected]>
Since sos-4.0 is py3 only, set the interpreset of the binary as such

Signed-off-by: Jake Hunsaker <[email protected]>
We need to carry a `sosreport` binary for some time to allow for end
users and downstreams to adjust to the new binary. It was hoped
originally that the old `sosreport` binary could be maintained in place
and simply provide the older set of functionality.

This has proven to not be possible givent he overhaul of the options
handling that allows us to have multiple subcommands. So while we will
still ship an `sosreport` binary, and it will be locked to `report`
functionality, it is now a simple redirection script that also makes the
user aware of the new `sos` binary.

Closes: sosreport#1986

Signed-off-by: Jake Hunsaker <[email protected]>
Moves the HTML/JSON reporting bits under sos/report.

Signed-off-by: Jake Hunsaker <[email protected]>
Adds support to the parser to allow aliases for subcommands in the
parser. At the moment, this only applies to `report`, with the 'rep'
shorthand. This is however intended for further 4.0 use via the
integration of sos-collector and soscleaner. For example, invoking 'sos
clean' and 'sos cleaner' would both be possible.

Signed-off-by: Jake Hunsaker <[email protected]>
Consolidated PEP8 fixes following reorganization and redeisgn work.

Signed-off-by: Jake Hunsaker <[email protected]>
Updates the test script `simple.sh` used by Travis to use the new `sos`
binary with the `report` subcommand.

Signed-off-by: Jake Hunsaker <[email protected]>
Removes an extraneous import identified by LGTM

Signed-off-by: Jake Hunsaker <[email protected]>
@TurboTurtle
Copy link
Member Author

Ok, trivial fixups have been merged and this PR has been rebased on current master.

We have acks from both the Red Hat and Canonical sides, so I'm viewing this as good-to-go in its current state. However, I will leave this PR open for further comments until tomorrow morning (-ish) for any last minute critiques. Should there be none, I'll merge at that point.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants