-
Notifications
You must be signed in to change notification settings - Fork 79
Fix s3 uri format so that its explicitly a s3 URI #2060
Conversation
The hashicorp go-getter lib that Nomad uses internally doesn't recognize the http:// uri as a s3 uri for some reason. As such, I'm updating the format to be explicitly a s3 uri. This fixes the plugin task when run in AWS I've removed the comments about the slack thread, since that thread has been deleted (past the 90 day threshold or whatever it is)
Codecov ReportBase: 39.37% // Head: 39.39% // Increases project coverage by
Additional details and impacted files@@ Coverage Diff @@
## main #2060 +/- ##
==========================================
+ Coverage 39.37% 39.39% +0.01%
==========================================
Files 436 436
Lines 10154 10154
==========================================
+ Hits 3998 4000 +2
+ Misses 6156 6154 -2
Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here. ☔ View full report at Codecov. |
@@ -1,13 +1,11 @@ | |||
use grapl_config::env_helpers::ENV_ENDPOINT; | |||
|
|||
/// Some discussion about this function here: | |||
/// https://grapl-internal.slack.com/archives/C02J5JYS92S/p1648667074625629 | |||
pub fn get_s3_url(bucket: &str, key: &str) -> String { | |||
let local_endpoint = std::env::var(ENV_ENDPOINT).ok(); | |||
// If the above is specified, we're running locally against Localhost | |||
// Otherwise, it's against prod S3; so we can use virtual-hosted-style URLs |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment looks like it needs updating now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated. Unfortunately, since I can't access the slack message, the comment I've added may be incomplete but 🤷 .
Let me know if there's something you think I should add.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was mainly referring to the fact that it's not an S3 virtual-host-style URL anymore (and not even a URL, or even a URI, I think 🤷)
If it's all about how go-getter
handles things, then a mention of that and any relevant links would be useful.
Without that kind of background information, I could easily see someone in the future "helpfully" refactoring this into something that doesn't work anymore.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated docs again. This should be better
pub fn get_s3_url(bucket: &str, key: &str) -> String { | ||
let local_endpoint = std::env::var(ENV_ENDPOINT).ok(); | ||
// If the above is specified, we're running locally against Localhost | ||
// Otherwise, it's against prod S3; so we can use virtual-hosted-style URLs | ||
local_endpoint.map_or_else( | ||
|| format!("http://{bucket}.s3.amazonaws.com/{key}"), | ||
|| format!("s3::{bucket}.s3.amazonaws.com/{key}"), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How is this a URL? I would buy s3://...
but this just looks weird.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah let me change the function name to URI. And agreed that this is a very weird schema. It feels like someone was like, well our arns already use ::
as separators, why not just reuse that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Out of curiosity, would it have worked with https://
instead of http://
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sadly no. I tried that early on 😿
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So apparently s3://
does work and is officially supported by AWS. I've switched to that, especially since this is the format the s3 URI link in the console is (https://s3.console.aws.amazon.com/s3/object/plugin-registry-bucket-b1abfd2?region=us-east-1&prefix=plugins/tenant_id_024885b9-dcf0-469d-b12a-81876cc045c8/plugin_type-analyzer/51f8b573-6b7c-411c-9379-2b1070205c1f.bin and search s3 URI).
Apparently, the s3::
was something the go-getter lib created and call Forced Protocols. Basically its a way of saying that this is a s3 url.
Added a comment explaining what this function is and what its used by to replace the comment that linked to a deleted slack comversation. I've also renamed everything from url to uri to be more explicit about what to expect.
pub fn get_s3_url(bucket: &str, key: &str) -> String { | ||
/// This gets a s3 URI to be used by Nomad's artifact stanza (in particular the grapl-plugin). | ||
/// The underlying go-getter lib, doesn't work with the `http://` schema prefix, but does work with | ||
/// an explicit s3 prefix `s::` or without a prefix. We've opted for the s3 prefix to be explicit. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s::
-> s3::
Apparently the `s3::` protocol isn't a real thing, its something the go-getter lib created and call [Forced Protocols](https://github.com/hashicorp/go-getter#forced-protocol) However, AWS does support a real s3 protocol `s3://`, which I've tested with the go-getter lib and it does work. Even better, this is the same format as the s3 URI in the console, so its easy to copy paste a link
I'm tempted to spend some time actually adding http protocol support to go-getter, but that will still take some time to get merged into Nomad. In the meantime I've opened a ticket to clarify what s3 formats are actually supported: hashicorp/go-getter#387 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think i'd prever a grapl_config::env_helpers::is_local_aws()
but I know people have Very Strong Opinions on resurrecting IS_LOCAL
. In this case, it makes sense to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tiny documentation tweak, but otherwise 👍
Thanks for digging into the guts of this issue 🎉
|
||
/// This gets a s3 URI to be used by Nomad's artifact stanza (in particular the grapl-plugin). | ||
/// The underlying library, Hashicorp's [go-getter](https://github.com/hashicorp/go-getter) lib, | ||
/// doesn't work with the `http://` schema prefix, but does work with an explicit s3 prefix `s://` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s://
-> s3://
Which issue does this PR correspond to?
Fixes https://github.com/grapl-security/issue-tracker/issues/1045
What changes does this PR make to Grapl? Why?
This updates the s3 uri used with the Nomad artifact stanza in order to fix the 403 error we were seeing in AWS. This appears to be an issue with go-getter and http://, and can be seen in this thread: hashicorp/nomad#1113
An alternate fix would be to remove the prefix entirely (ie get rid of s3:: or the previous http://)
I've removed the comments about the slack thread, since that thread has been deleted (past the 90 day threshold or whatever it is)
How were these changes tested?