-
Notifications
You must be signed in to change notification settings - Fork 2.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
kube play: cannot unmarshal ... PodRmReport.RmReport.Err #16154
Comments
An interesting side note: two of these three failures are associated with #15367 (the everything-hosed-forever flake): as soon as this |
A friendly reminder that this issue had no activity for 30 days. |
A weird thing is that there is no |
Three new instances, but slightly different. These are all
All of them in |
Seen Jan 27 in sys remote f37 rootless:
Summary so far: happening both root and rootless, int and sys, f36 and f37, and still happening many months after first reported. |
I guess something like diff --git a/pkg/bindings/errors.go b/pkg/bindings/errors.go
index d9dfa95a6..af7ab0c45 100644
--- a/pkg/bindings/errors.go
+++ b/pkg/bindings/errors.go
@@ -7,6 +7,7 @@ import (
"io"
"github.com/containers/podman/v4/pkg/errorhandling"
+ "github.com/sirupsen/logrus"
)
var (
@@ -15,7 +16,8 @@ var (
func handleError(data []byte, unmarshalErrorInto interface{}) error {
if err := json.Unmarshal(data, unmarshalErrorInto); err != nil {
- return err
+ logrus.Errorf("Error unmarshaling into %#v, data %q", unmarshalErrorInto, data)
+ panic("handleError failed")
}
return unmarshalErrorInto.(error)
}
@@ -35,7 +37,10 @@ func (h APIResponse) ProcessWithError(unmarshalInto interface{}, unmarshalErrorI
}
if h.IsSuccess() || h.IsRedirection() {
if unmarshalInto != nil {
- return json.Unmarshal(data, unmarshalInto)
+ if err := json.Unmarshal(data, unmarshalInto); err != nil {
+ logrus.Errorf("Error unmarshaling into %#v, data %q", unmarshalInto, data)
+ panic("APIResponse.ProcessWithError failed")
+ }
}
return nil
} could help diagnose, but if this is an infrequent flake we can’t trigger, a draft-only PR probably wouldn’t help, and I’m not comfortable with shipping those Maybe we could just log the data and type in the error, and keep that shipped on the main branch until this triggers again… |
Hey, could "unexpected end of JSON input" be a variation of this bug? Seen in podman machine test, f37 aarch64 rootless:
|
Let’s start with that: #17282 . |
Yay!. Does this help?
(Sorry for the long line) |
It definitely does. The underlying cause, in
That looks rather like some other flakes, e.g. #11594 (comment) And then there seems to be a design issue in error handling of the Kube commands: With the disclaimer that I’m very new to this code, and I am just reading it, not testing it in practice: Summary: The E.g. consider how podman/pkg/domain/infra/tunnel/pods.go Line 171 in 90b18d2
PodRmReport.Err field is manually set by the remote’s client caller, i.e. it is not transferred directly as JSON. (Well, actually it is, podman/pkg/bindings/pods/pods.go Line 212 in 90b18d2
Err field, but as long as there is no error, the value is nil and that is passed through JSON just fine.)
If the server fails, a different HTTP status triggers transfer not of podman/pkg/errorhandling/errorhandling.go Line 93 in 90b18d2
PodRmReport.Err .
That’s the way this seems to work for the standalone Fixing this seems to mean changing the structure of And as much as I originally looked into this bug because of a stupid oversight (#16154 (comment) mixes up type and field names, there is actually no mystery in that), I’d like to leave this |
Seen just now in f37 remote rootless, and again everything after that is hosed (#17216) even though it's not a "podman rm":
|
Now seen in debian also. remote, root. |
f37 remote root, and once it happens, all future tests are hosed:
I'm really quite certain this is related to #17216 |
My understanding, per the above, is that the #17216 underlying cause is the same, but on top, this demonstrates a bug in |
That's a fair point. Let's reopen to look into the error reporting. |
Had a quick look. The underlying issue is that the returned type includes a |
Here's a recent one:
|
Three instances, first in August. Root/rootless. Always remote.
The text was updated successfully, but these errors were encountered: