-
Notifications
You must be signed in to change notification settings - Fork 363
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix gateway handle list objects versions #7028
Conversation
I believe @arielshaqed had a wider gateway's operations scope in #7025, i.e. every S3 operation that is handled by our gateway as a different operation. If it's just this fix, YaY! |
As we love small PRs - so I'll do it incrementally. From a quick check this solved the bug I found on how we identify different API calls related to the problem described. The main issue is how to handle unrecognized query string parameters in the S3 protocol. The S3 protocol relies on HTTP methods and query string parameters to identify the intended operation for each user call. However, the current protocol definition covers around 70 operations. In response to calls with unfamiliar query string parameters, we can either reject them or implement default behaviors in specific cases. The appropriate approach requires careful consideration to maintain compatibility and ensure seamless user experience. The current implementation handle specific cases - like ignore "version" for list, get and etc. |
Switch to draft - will introduce all the required changes |
Not sure that I agree. I believe that we should precisely parse the operation, and then apply it. The fact that something like "version" makes our lives harder only means that we need to be more careful here. We should understand all query parameters. The good news to me is that we only implement a tiny set of operations! Also we might look again at how MinIO do this. |
79e7e22
to
3ef34d9
Compare
re-implemented the solution to report unsupported operation |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These huge lists of params to HandleUnsupported
scare me. For one thing they mean that if AWS add any new call to S3, lakeFS will claim to support it.
Is there no way of performing a positive check for whether we support an S3 method?
My concern is with additional parameters that may be passed to the same call that currently handled, ex: list objects v2 with metadata and handled by the same operation but without metadata support. Will open an issue to refactor the s3 gateway router to do a better job in matching the handler to the commands based on the method/query params. In order not to break something added a flag to disable the unsupported checks that were added. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Only blocking change is the name of the config variable. (But you know me -- I really like my logs, I just won't block this PR on that.)
case ref != "" && pth != "": | ||
o.OperationID = pathBasedOperationID(req.Method) | ||
case ref == "" && pth == "": | ||
o.OperationID = repositoryBasedOperationID(req.Method) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we're missing a default
case here! Even if we keep going in order to fail later, I would really like a log (level TRACE) for each of the four cases. Otherwise debugging a bad request will be that much harder.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The o.OperationID = operations.OperationIDOperationNotFound
assigned before the switch case, I wanted to log this request as operation not found and let it fall into the handler.
In the previous version, it was odd that we had handler in the switch that returns after w.WriteHeader(http.StatusNotFound)
and in case pathBasedOperationID
returns OperationIDOperationNotFound
we continue to fall and handle it after logging.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Still open? Reminder, there are 2 things here
- (Smaller) There is no default case. I'd rather not have the default here in a default not up on l.202.
- (Larger) I'd like to log (trace) the params repo, ref, pth, and which type of operation we picked.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is still open.
query := req.URL.Query() | ||
if !query.Has("delete") { | ||
_ = o.EncodeError(w, req, nil, gerrors.ERRLakeFSNotSupported.ToAPIErr()) | ||
return | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we HandleUnsupported
here? One day we'll add delete-part or something, and even before that this is less uniform so I need to think harder when reading it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The thing here was that POST on bucket/repo level is postpolicybucket
operations (no query parameters) and the only other operation in this level is "deletemultipleobjects
(query parameter "delete").
In all other cases HandleUnsupported
handled the non-default case, and here we don't handle the default case.
As I didn't wanted to manage a whitelist, at least until we will have a better router for the S3 gateway, I thought this will make more sense.
Let me know what you think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pleased? (Or say no, that also works -- prefer not to leave this open)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would really not like to delete someone's object because they did some random POST. But if we cannot lock it down, we cannot lock it down...
pkg/gateway/operations/getobject.go
Outdated
@@ -28,6 +28,10 @@ func (controller *GetObject) RequiredPermissions(_ *http.Request, repoID, _, pat | |||
} | |||
|
|||
func (controller *GetObject) Handle(w http.ResponseWriter, req *http.Request, o *PathOperation) { | |||
// TODO(barak): do we like to reject tagging or return empty tagging? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pick one. I'd say reject, but I definitely don't want get-tags to return the object contents -- which AFAICT the current code does.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
kept the previous implementation - return empty tagging response
♻️ PR Preview d47dc98 has been successfully destroyed since this PR has been closed. 🤖 By surge-preview |
@arielshaqed thanks - update the code+doc except one comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
Just a few comments left open.
case ref != "" && pth != "": | ||
o.OperationID = pathBasedOperationID(req.Method) | ||
case ref == "" && pth == "": | ||
o.OperationID = repositoryBasedOperationID(req.Method) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Still open? Reminder, there are 2 things here
- (Smaller) There is no default case. I'd rather not have the default here in a default not up on l.202.
- (Larger) I'd like to log (trace) the params repo, ref, pth, and which type of operation we picked.
query := req.URL.Query() | ||
if !query.Has("delete") { | ||
_ = o.EncodeError(w, req, nil, gerrors.ERRLakeFSNotSupported.ToAPIErr()) | ||
return | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pleased? (Or say no, that also works -- prefer not to leave this open)
@arielshaqed I forgot to submit my comments with my last pr update |
|
c43792f
to
551ab3f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! Do we want a tech debt issue to improve the S3 operation multiplexor?
case ref != "" && pth != "": | ||
o.OperationID = pathBasedOperationID(req.Method) | ||
case ref == "" && pth == "": | ||
o.OperationID = repositoryBasedOperationID(req.Method) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is still open.
query := req.URL.Query() | ||
if !query.Has("delete") { | ||
_ = o.EncodeError(w, req, nil, gerrors.ERRLakeFSNotSupported.ToAPIErr()) | ||
return | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would really not like to delete someone's object because they did some random POST. But if we cannot lock it down, we cannot lock it down...
Open #7100 to address router/mux |
@arielshaqed about lock |
Fix #7025
getbucketlocation
API call based on s3 configured region