[redhat|policy] Provide a URL message for S3 bucket based uploads #3891

TrevorBenson · 2024-12-23T22:01:20Z

Please place an 'X' inside each '[]' to confirm you adhere to our Contributor Guidelines

Is the commit message split over multiple lines and hard-wrapped at 72 characters?
Is the subject and message clear and concise?
Does the subject start with [plugin_name] if submitting a plugin patch or a [section_name] if part of the core sosreport code?
Does the commit contain a Signed-off-by: First Lastname [email protected]?
Are any related Issues or existing PRs properly referenced via a Closes (Issue) or Resolved (PR) line?
Are all passwords or private data gathered by this PR obfuscated?

Closes #3890

packit-as-a-service · 2024-12-23T22:03:54Z

Congratulations! One of the builds has completed. 🍾

You can install the built RPMs by following these steps:

sudo yum install -y dnf-plugins-core on RHEL 8
sudo dnf install -y dnf-plugins-core on Fedora
dnf copr enable packit/sosreport-sos-3891
And now you can install the packages.

Please note that the RPMs should be used only in a testing environment.

Signed-off-by: Trevor Benson <[email protected]>

jcastill

LGTM.
I did a very quick test with non-S3 uploads and it worked as expected. I'll test more thoroughly in a couple of days.

TurboTurtle

The code LGTM, but I wonder if we wouldn't be better served if we instead change the logic that was generating the original problematic messaging in #3890. It would strike me as a little strange for the RH API path to be appended on to whatever endpoint/bucket path I'd otherwise be expecting.

TurboTurtle · 2024-12-24T15:30:52Z

sos/policies/distros/redhat.py

+                rh_case_api = "/support/v1/cases/%s/attachments"
+                return f"{endpoint}/{bucket}" + rh_case_api % self.case_id


Let's make this all f-string formatting; mixing the two syntaxes is a little awkward.

I'll adjust this accordingly once I have additional feedback to #3891 (comment)
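
For reference, a minimal sketch of what the all f-string form could look like. The standalone function and its parameter names are hypothetical, standing in for the method's `endpoint`, `bucket`, and `self.case_id`; it is not the code that ended up in the PR.

```python
def rh_s3_upload_url(endpoint: str, bucket: str, case_id: str) -> str:
    """Build the message URL with a single f-string (illustrative only)."""
    # Equivalent to the mixed form in the diff:
    #   f"{endpoint}/{bucket}" + "/support/v1/cases/%s/attachments" % case_id
    return f"{endpoint}/{bucket}/support/v1/cases/{case_id}/attachments"


print(rh_s3_upload_url("https://sosreport.example.com", "project-sosreports", "12345"))
```

The `%` operator binds tighter than `+`, so the original expression formats the case path first and then concatenates, which is why one f-string gives the same string.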

TrevorBenson · 2024-12-24T19:20:24Z

The code LGTM, but I wonder if we wouldn't be better served if we instead change the logic that was generating the original problematic messaging in #3890.

Unless I overlooked something, this function is what ends up generating both parts of the message: the missing case id, and the upload to Red Hat Secure FTP.

It would strike me as a little strange for the RH API path to be appended on to whatever endpoint/bucket path I'd otherwise be expecting.

Only included this as I think @jcastill mentioned RH potentially using an S3 bucket in the future. I presume he may be adding an RH_S3_HOST (aka an S3 Endpoint) in the work he is doing separately.

I want to be sure I understand your request. When both a case id and a bucket prefix are provided, the resulting sos-collector tar archive will contain the case id in its name, but the archive itself is only found inside the provided S3 object prefix? Or are you suggesting we drop the rh_case_api altogether, so that if the user provides a case id without providing an object prefix, the tar file, with a case id in its name, ends up in the root of the bucket?

Thanks in advance for any clarification.

TurboTurtle · 2024-12-26T18:05:39Z

Let me answer this way, please let me know if I'm not making sense here.

As an end-user creating an sos report and uploading it to an S3 location that I've defined in my config file or in the sos command I'm running, I would expect:

No message about SFTP, regardless of whether a case id was provided or not
No message about a Red Hat location
The sos report to end up in the endpoint/bucket I defined, and not nested under a path I did not specify.

On that last point that means I'd expect, with or without a case id, this path from #3890:

s3://project-sosreports/ba59c727-bfd2-4808-8f5f-f62c63903c13/2024/12 or, potentially, s3://project-sosreports/ba59c727-bfd2-4808-8f5f-f62c63903c13/2024/12/$case_id

and not

s3://project-sosreports/support/v1/cases/12345/attachments
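
The expectation above can be sketched as a small path builder. This is a rough illustration only: the function name and parameters are hypothetical, and appending the case id is shown as optional because the comment describes it as a possibility, not a decision.

```python
from typing import Optional


def s3_destination(bucket: str, prefix: str = "",
                   case_id: Optional[str] = None) -> str:
    """Join bucket, user-specified prefix, and optional case id."""
    # Strip stray slashes and drop empty parts so the key never contains "//".
    parts = [bucket.strip("/"), prefix.strip("/"), (case_id or "").strip("/")]
    return "s3://" + "/".join(p for p in parts if p)


prefix = "ba59c727-bfd2-4808-8f5f-f62c63903c13/2024/12"
print(s3_destination("project-sosreports", prefix))
print(s3_destination("project-sosreports", prefix, "12345"))
```

Nothing other than what the user supplied (bucket, prefix, and possibly the case id) ends up in the path, so no Red Hat API segment can appear in it.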

If RH starts using an S3 bucket, we'd address any handling of that within the Red Hat upload methods so that it doesn't impact a user uploading to a non-RH S3 location in any way. We really need to separate upload from policy, and I'll try and spend some time on that this week and next - but for the purposes of this PR I'm mainly just concerned about the above - predictable and expected locations within a user-specified S3 destination.

TrevorBenson · 2024-12-26T19:44:35Z

Makes sense and clarifies the case of no prefix with a case id, which will result in the path: s3://project-sosreports/$case_id.

I'll make the adjustments and force push a new commit for review.

TrevorBenson · 2024-12-28T00:00:22Z

I thought providing an https:// based upload_url that includes the endpoint would be more informative. On further review I see why I used the s3:// url instead:

sos/sos/policies/distros/__init__.py

Lines 547 to 567 in ace5715

    
    def _determine_upload_type(self):
        """Based on the url provided, determine what type of upload to attempt.

        Note that this requires users to provide a FQDN address, such as
        https://myvendor.com/api or ftp://myvendor.com instead of
        myvendor.com/api or myvendor.com
        """
        prots = {
            'ftp': self.upload_ftp,
            'sftp': self.upload_sftp,
            'https': self.upload_https,
            's3': self.upload_s3
        }
        if self.commons['cmdlineopts'].upload_protocol in prots:
            return prots[self.commons['cmdlineopts'].upload_protocol]
        if '://' not in self.upload_url:
            raise Exception("Must provide protocol in upload URL")
        prot, _ = self.upload_url.split('://')
        if prot not in prots:
            raise Exception(f"Unsupported or unrecognized protocol: {prot}")
        return prots[prot]
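
To make the consequence concrete, here is a simplified, standalone sketch of the same selection logic. `pick_protocol` and `SUPPORTED` are hypothetical names, it returns the protocol name instead of a bound method, and it uses `str.partition` where the quoted method uses `split`; it is not the sos implementation.

```python
from typing import Optional

SUPPORTED = ("ftp", "sftp", "https", "s3")


def pick_protocol(upload_url: str, upload_protocol: Optional[str] = None) -> str:
    """Return the protocol name that would select the upload handler."""
    # An explicit, supported protocol option takes precedence over the URL.
    if upload_protocol in SUPPORTED:
        return upload_protocol
    if "://" not in upload_url:
        raise ValueError("Must provide protocol in upload URL")
    # partition() splits on the first "://" only.
    prot, _, _ = upload_url.partition("://")
    if prot not in SUPPORTED:
        raise ValueError(f"Unsupported or unrecognized protocol: {prot}")
    return prot


print(pick_protocol("s3://project-sosreports/2024/12"))
print(pick_protocol("https://sosreport.example.com/project-sosreports"))
```

Under these assumptions, an `https://` URL with no explicit protocol option selects the HTTPS handler, which is why an `s3://` URL is needed for the S3 handler to be chosen from the URL alone.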

The upload_s3 method doesn't currently rely on upload_url, so when get_upload_url returns a string with the case_id appended, the "Attempting upload to" message shows it correctly, but it's dropped by the upload process.

While appending the case_id to the object prefix seemed a simple fix for this, I found that on a freshly installed RHEL 8.9 system the get_upload_url() method of RHELPolicy is called multiple times, or recursively. This led to the case_id being appended more than once when appending it directly to the S3 object prefix instead of the upload_url.

[RHELPolicy.get_upload_url] called
[RHELPolicy.get_upload_url] called
[RHELPolicy.get_upload_url] called
[RHELPolicy.get_upload_url] called
Attempting upload to https://sosreport.example.com/project-sosreports/ba59c727-bfd2-4808-8f5f-f62c63903c13/2024/12/12345/12345
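
A doubled suffix like `12345/12345` is what one would expect when a getter appends to stored state and is called more than once. A minimal sketch, assuming a hypothetical `DemoPolicy` class with illustrative attribute names (not the real `RHELPolicy`), contrasts a mutating getter with one that derives the value on each call. The trace above shows four calls but only a doubled suffix, so not every call appended; the sketch only illustrates the general failure mode.

```python
class DemoPolicy:
    """Hypothetical stand-in for a policy object; not sos code."""

    def __init__(self, prefix: str, case_id: str):
        self.prefix = prefix
        self.case_id = case_id

    def get_prefix_mutating(self) -> str:
        # Appends to stored state, so every call adds another case id.
        self.prefix = f"{self.prefix}/{self.case_id}"
        return self.prefix

    def get_prefix_derived(self) -> str:
        # Builds the result from unchanged state; safe to call repeatedly.
        return f"{self.prefix}/{self.case_id}"


policy = DemoPolicy("2024/12", "12345")
print(policy.get_prefix_derived())   # 2024/12/12345
print(policy.get_prefix_derived())   # 2024/12/12345
print(policy.get_prefix_mutating())  # 2024/12/12345
print(policy.get_prefix_mutating())  # 2024/12/12345/12345
```

Keeping the getter free of side effects makes the number of calls irrelevant, which sidesteps the question of why the method is invoked repeatedly.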

I'm going to convert this PR to a draft while determining how best to implement the requests and whether aligning S3 more closely with the other upload protocols' logic is possible.

TrevorBenson force-pushed the sos-trevorbenson-upload-url-string branch from 5e10fe9 to e9deb52 Compare December 23, 2024 22:09

[redhat|policy] get_upload_url return for s3 protocol

c1fe344

Signed-off-by: Trevor Benson <[email protected]>

TrevorBenson force-pushed the sos-trevorbenson-upload-url-string branch from e9deb52 to c1fe344 Compare December 23, 2024 22:10

jcastill added the Status/RedHat QE (RH QE has been requested to review), Status/RedHat Eng (RedHat Engineering has been requested to review), and Kind/RedHat (RedHat related item) labels Dec 24, 2024

jcastill approved these changes Dec 24, 2024


jcastill added the Reviewed/Other Ack (Acknowledged by a member) label Dec 24, 2024

TurboTurtle reviewed Dec 24, 2024


TrevorBenson marked this pull request as draft December 28, 2024 00:00
