[coredump] Refactor info and dump collection #3811

Merged
TurboTurtle merged 1 commit into sosreport:main from TurboTurtle:coredump-refactor on Nov 6, 2024

Conversation

TurboTurtle (Member)

Refactor the plugin to adjust how `coredumpctl info` and coredump file collection are handled.

The plugin will now collect the compressed coredump file for each coredump entry for which coredumpctl reports that the file is present, up to the first X entries, where X is the new `dumps` plugin option.

A second new plugin option, `executable`, can be used to specify a regex string to match coredump entries against to determine whether they should be collected. The default behavior is to collect for all entries.

A symlink for each collected coredump file will be dropped in the plugin directory to aid in associating dump files with coredumpctl entries, as the association may not always be obvious to end users from filenames alone.
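The selection behavior described above could look roughly like the following sketch, as it might appear inside the plugin's setup method. The entry list and the `collect_dump` helper are hypothetical names for illustration, not the plugin's actual code:

    import re

    # Hypothetical sketch: collect at most `dumps` coredumps, optionally
    # filtered by the `executable` regex, and only for entries whose dump
    # file coredumpctl reports as present on disk.
    collected = 0
    exe_regex = self.get_option("executable")    # may be empty
    for entry in coredump_entries:               # parsed from `coredumpctl list`
        if collected >= self.get_option("dumps"):
            break
        if exe_regex and not re.search(exe_regex, entry.executable):
            continue
        if not entry.present:                    # no dump file on disk
            continue
        self.collect_dump(entry)                 # hypothetical helper
        collected += 1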


Please place an 'X' inside each '[]' to confirm you adhere to our Contributor Guidelines

  • Is the commit message split over multiple lines and hard-wrapped at 72 characters?
  • Is the subject and message clear and concise?
  • Does the subject start with [plugin_name] if submitting a plugin patch or a [section_name] if part of the core sosreport code?
  • Does the commit contain a Signed-off-by: First Lastname [email protected]?
  • Are any related Issues or existing PRs properly referenced via a Closes (Issue) or Resolved (PR) line?
  • Are all passwords or private data gathered by this PR obfuscated?


Congratulations! One of the builds has completed. 🍾

You can install the built RPMs by following these steps:

  • sudo yum install -y dnf-plugins-core on RHEL 8
  • sudo dnf install -y dnf-plugins-core on Fedora
  • dnf copr enable packit/sosreport-sos-3811
  • And now you can install the packages.

Please note that the RPMs should be used only in a testing environment.

TurboTurtle (Member Author)

So this one turned out more interesting than I expected. The dump size reported by coredumpctl is the on-disk compressed size. Since the service uses zstd, a seemingly innocuous 20MB file can inflate to over 1GB when extracted (which is what `coredumpctl dump` would produce). So there isn't really a way to anticipate collection sizes within the plugin.

Given that, we can just collect the compressed coredumps, and then drop a symlink in the plugin dir to help users align which coredump they're looking at against the output of `coredumpctl list`, which isn't always straightforward. Not a standard practice by any means, but in this particular case I think it makes sense.
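The symlink idea could look something like this minimal sketch; `plugin_dir`, `pid`, `exe_name`, and `core_path` are assumed names here, and the actual naming scheme in the merged code may differ:

    import os

    # Illustrative sketch: give the collected dump a human-readable alias
    # (executable name + PID) so users can match it against the output of
    # `coredumpctl list`.
    link_name = os.path.join(plugin_dir, f"core.{exe_name}.{pid}")
    if not os.path.exists(link_name):
        os.symlink(core_path, link_name)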

TurboTurtle force-pushed the coredump-refactor branch 2 times, most recently from e6e9965 to 1e5a263, on October 18, 2024 at 19:57
TurboTurtle (Member Author)

Ubuntu 18.04 failures are due to the Python version. We recently agreed to move to 3.8 as the minimum version. cirrus.yaml should be updated shortly, once the GCE image for the new Ubuntu dev release is available, per IRC:

15:50:13 @turboturtle | arif-ali: re: #3811 - I'm assuming the failures in the 18.04 test are due to python version (invalid syntax, this is the first commit using a := in the project) - is 18.04 a py3.6 by default release? If so, I think it's time to drop the 18.04 test
15:56:13    arif-ali  | turboturtle: yup, walrus is 3.8, and bionic (18.04) should therefore be dropped. I have a PR ready locally to track new cirrusci.yaml changes to pick the new dev release, which has been tagged, but the GCE image is due to be ready next week.

PR on hold until the Cirrus test matrix is updated.
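For context, the `:=` assignment expression (PEP 572) that triggers the 18.04 failure is a hard SyntaxError on Python 3.6/3.7. A self-contained example (the sample string is illustrative):

    import re

    output = "Storage: /var/lib/systemd/coredump/core.myapp.zst (present)"
    # SyntaxError on Python < 3.8; fine on 3.8+:
    if (core := re.search(r"Storage: (.*)", output)):
        print(core.group(1))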

    core = re.search(r"(^\s*Storage:(.*)(\(present\)))", res, re.M)
    try:
        core_path = core.groups()[1].strip()
        self.add_copy_spec(core_path, tailit=False, sizelimit=100)
Contributor

If we skip collecting the core due to its >100M size, will cores_collected be incremented? (Ideally it shouldn't.)

TurboTurtle (Member Author)

That's a good catch of a current blind spot in `tailit` handling. We don't actually give any indication to the caller that the collection was skipped. I will dig into this.

Member

Is it worth having an option that allows it to copy a bigger size? I had a coredump that was 2GB in size recently, and we needed it. I think it would be worthwhile to have that as an option? Thoughts?

TurboTurtle (Member Author)

I'm hesitant to add plugin options around size limiting. Though... what about extending `log_size` to a per-plugin setting, similar to `timeout`?


    plugin_name = "coredump"
    profiles = ('system', 'debug')
    packages = ('systemd-udev', 'systemd-coredump')

    option_list = [
        PluginOpt("detailed", default=False,
                  desc="collect detailed information for every report"),
        PluginOpt("dumps", default=5, desc="number of dump files to collect"),
Contributor

Collecting 5 coredumps as a default? Isn't that too much?

I am a bit inclined to 3 as the default, but no strong preference (and no real use case or user experience for how many coredumps are worth collecting).

TurboTurtle (Member Author)

5 was a shot-in-the-dark number; no problem with dropping it to 3.

Refactor the plugin to adjust how `coredumpctl info` and coredump file
collection are handled.

The plugin will now collect the compressed coredump file for coredump
entries for which coredumpctl reports that the file is present, for the
first X number of coredump files, where X is the new `dumps` plugin
option.

A second new plugin option, `executable`, can be used to specify a
regex string to match coredump entries against to determine whether
they should be collected. The default behavior is to collect for all
entries.

A symlink for each collected coredump file will be dropped in the
plugin directory to aid in associating dump files with coredumpctl
entries, as it may not always be obvious for end users from filenames
alone.

Signed-off-by: Jake Hunsaker <[email protected]>
TurboTurtle (Member Author)

Dropped the number of cores from 5 to 3. Bumped the core size default to 200 since 100 can reasonably be considered small for coredumps.

There's not a great way to detect skips from add_copy_spec() today, so I inserted a stat to avoid incrementing the counter when we will skip a core path. Noted a TODO for extending log-size to a per-plugin option. I'll probably handle that by EOM, but it is a somewhat involved logical dance, if the timeout handling is any indication.
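A minimal sketch of the stat guard described above, inside the per-entry loop. The 200MB limit and the `cores_collected` counter name are assumptions based on this thread, not the merged code verbatim:

    import os

    limit_bytes = 200 * 1024 * 1024
    try:
        # Skip oversized cores up front, so the counter only advances for
        # cores we will actually collect.
        if os.stat(core_path).st_size > limit_bytes:
            continue
    except OSError:
        continue  # file vanished between listing and collection
    self.add_copy_spec(core_path, tailit=False, sizelimit=200)
    cores_collected += 1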

TurboTurtle merged commit 99d0ae2 into sosreport:main on Nov 6, 2024
33 checks passed