-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add events to indicate start and end of skaffold dev iterations #4037
Add events to indicate start and end of skaffold dev iterations #4037
Conversation
Codecov Report
|
proto/skaffold.proto
Outdated
@@ -344,11 +353,32 @@ enum ErrorCode { | |||
STATUS_CHECK_NODE_NOT_READY= 406; | |||
|
|||
// Unknown Error Codes | |||
ErrorCode_UNKNOWN = 501; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How is this different from COULD_NOT_DETERMINE
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
COULD_NOT_DETERMINE
or 0 value is default enum. In cases where we don't send any error Code the value is 0. To differentiate between when it is empty Vs when we know its an unknown error, i added the field.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we not enforce that the error-code must be explicitly provided?
pkg/skaffold/errors/errors.go
Outdated
func ErrorCodeFromError(_ error, _ phase) proto.ErrorCode { | ||
return proto.ErrorCode_COULD_NOT_DETERMINE | ||
func ErrorCodeFromError(_ error, _ Phase) proto.ErrorCode { | ||
return proto.ErrorCode_ErrorCode_UNKNOWN |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
COULD_NOT_DETERMINE
seems more apt here?
It feels like we should have an error type that carries the ErrorCode inside, rather than returning an error-code and an error. Then we could get rid of STATUS_CHECK_NO_ERROR
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would make sense to add a new proto ErrorDef
with those 2 fields. However, we won't be able to change that for other events BuildEvent
, DeployEvent
, StatusCheckEvent
etc since IDEs are already consuming it.
Re: STATUS_CHECK_NO_ERR
, it was something we discussed for regarding following HTTP error codes for success. It would also be easy for us to understand metrics if we define a success code vs "0" since 0 is the default value for int type enum
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking back through this, I'm confused by the intent of this function. It's a placeholder? Can we hook in the pattern-matching aspects of #4045 here?
Or we we could our own error
that carries an ErrorCode? Like the os.Err*
variables and os.IsExist()
.
example
type ErrorWithCode interface {
error
ErrorCode() ErrorCode
}
type errorWithCode struct {
errorCode ErrorCode
err error
}
func (e errorWithCode) ErrorCode() ErrorCode {
return e.errorCode
}
func (e errorWithCode) Error() string {
return fmt.Sprintf("%d: %s", e.errorCode, e.err.Error())
}
func NewErrorCode(errorCode ErrorCode, err error) ErrorWithCode {
return errorWithCode{errorCode: errorCode, err: err}
}
pkg/skaffold/errors/errors.go
Outdated
Deploy = Phase("Deploy") | ||
StatusCheck = Phase("StatusCheck") | ||
FileSync = Phase("FileSync") | ||
Dev = Phase("Dev") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is Dev
really a phase? (We don't actually use Dev from what I can tell?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it is not a phase, but dev-init is a stage or step in dev loop where errors can happen.
I wanted to capture this "init" step errors.
Shd "DevInit" be better?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
also, i can rename "Phase" to "Stage"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh! So these are phases of the dev loop? That makes sense, and could you add a comment to that effect please? (It seems mildly odd for this to be defined in skaffold/errors
.) I think Dev
→ DevInit
or just Init
makes sense, and perhaps we should have a TearDown
too.
(I had wondered if we need to differentiate between dev and debug, but the dev loop is the same so I think it's good.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Change Dev -> DevInit
Add TearDown
Add comment saying these are phases in dev loop.
581e583
to
6e99b4c
Compare
Thanks a lot for all your feedback @briandealwis . I have addressed most of the comments and refactored method signatures. Defining a new struct I am wiring up Actionable Error messages here #4045 and will propagate suggestions in the events. |
5dff70e
to
6889a05
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So metaEvent
is the session start event, and endEvent
is the session end? Are these supposed to be sent on every command or just the dev loop?
cmd/skaffold/app/cmd/commands.go
Outdated
// The actual error code will be passed in | ||
// https://github.com/GoogleContainerTools/skaffold/pull/4045 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't seem like the right place given that the MetaEvent is raised in runner.NewForConfig()
. I believe this will cause things like fix
to throw session events too. (Not to mention there's ExactArgs()
too.) Can we not have it be thrown at the end of the runner?
(Or if MetaEvent and EndEvent are ok for things like build
, deploy
, fix
, etc. then could we move the start/stop here.)
pkg/skaffold/errors/errors.go
Outdated
Deploy = Phase("Deploy") | ||
StatusCheck = Phase("StatusCheck") | ||
FileSync = Phase("FileSync") | ||
Dev = Phase("Dev") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh! So these are phases of the dev loop? That makes sense, and could you add a comment to that effect please? (It seems mildly odd for this to be defined in skaffold/errors
.) I think Dev
→ DevInit
or just Init
makes sense, and perhaps we should have a TearDown
too.
(I had wondered if we need to differentiate between dev and debug, but the dev loop is the same so I think it's good.)
pkg/skaffold/errors/errors.go
Outdated
func ErrorCodeFromError(_ error, _ phase) proto.ErrorCode { | ||
return proto.ErrorCode_COULD_NOT_DETERMINE | ||
func ErrorCodeFromError(_ error, _ Phase) proto.ErrorCode { | ||
return proto.ErrorCode_ErrorCode_UNKNOWN |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking back through this, I'm confused by the intent of this function. It's a placeholder? Can we hook in the pattern-matching aspects of #4045 here?
Or we we could our own error
that carries an ErrorCode? Like the os.Err*
variables and os.IsExist()
.
example
type ErrorWithCode interface {
error
ErrorCode() ErrorCode
}
type errorWithCode struct {
errorCode ErrorCode
err error
}
func (e errorWithCode) ErrorCode() ErrorCode {
return e.errorCode
}
func (e errorWithCode) Error() string {
return fmt.Sprintf("%d: %s", e.errorCode, e.err.Error())
}
func NewErrorCode(errorCode ErrorCode, err error) ErrorWithCode {
return errorWithCode{errorCode: errorCode, err: err}
}
proto/skaffold.proto
Outdated
message ActionableErr { | ||
ErrorCode errCode = 1; // error code representing the error | ||
string message = 2; // message describing the error. | ||
repeated string suggestions = 3; // a list of suggestions for the error. | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think reporting messages here is the right thing to do. I'm wondering if it would be better to have a secondary set of possible action codes that could be returned instead. That would also allow i18n or more.
(I think it's also an opportunity to somehow segregate PII fields, like artifact names or image tags, so that we might be able to report the actual errors seen?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. I should have made this clear. The suggestion text is just for IDEs to show to User.
Having suggestions plumbed through skaffold will reduce the duplication of efforts in both IDEs
I will add that as a comment.
For metrics, we should probably report codes like you mentioned.
pkg/skaffold/event/event.go
Outdated
logEntry.Entry = fmt.Sprintf("Dev Iteration %d in progress", de.Iteration) | ||
case Succeeded: | ||
logEntry.Entry = fmt.Sprintf("Dev Iteration %d successful", de.Iteration) | ||
default: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
change this case Failed:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
This reverts commit ca80757.
5a540c2
to
b467110
Compare
b467110
to
87a56c4
Compare
87a56c4
to
9a2f0ff
Compare
}, | ||
{ | ||
"name": "event.devLoopEvent.err.errCode", | ||
"description": " - UNKNOWN_ERROR: Could not determine error and phase\n - STATUSCHECK_SUCCESS: Status Check Success\n - STATUSCHECK_IMAGE_PULL_ERR: Container image pull error\n - STATUSCHECK_CONTAINER_CREATING: Container creating error\n - STATUSCHECK_RUN_CONTAINER_ERR: Container run error\n - STATUSCHECK_CONTAINER_TERMINATED: Container is already terminated\n - STATUSCHECK_CONTAINER_RESTARTING: Container restarting error\n - STATUSCHECK_NODE_MEMORY_PRESSURE: Node memory pressure error\n - STATUSCHECK_NODE_DISK_PRESSURE: Node disk pressure error\n - STATUSCHECK_NODE_NETWORK_UNAVAILABLE: Node network unavailable error\n - STATUSCHECK_NODE_PID_PRESSURE: Node PID pressure error\n - STATUSCHECK_NODE_UNSCHEDULABLE: Node unschedulable error\n - STATUSCHECK_NODE_UNREACHABLE: Node unreachable error\n - STATUSCHECK_NODE_NOT_READY: Node not ready error\n - STATUSCHECK_UNKNOWN: Status Check error unknown\n - STATUSCHECK_UNKNOWN_UNSCHEDULABLE: Container is unschedulable due to unknown reasons\n - STATUSCHECK_CONTAINER_WAITING_UNKNOWN: Container is waiting due to unknown reason\n - DEPLOY_UNKNOWN: Deploy failed due to unknown reason\n - SYNC_UNKNOWN: SYNC failed due to known reason\n - BUILD_UNKNOWN: Build failed due to unknown reason\n - DEVINIT_UNKNOWN: Dev Init failed due to unknown reason\n - CLEANUP_UNKNOWN: Cleanup failed due to unknown reason\n - SYNC_INIT_ERROR: File Sync Initialize failure\n - DEVINIT_REGISTER_BUILD_DEPS: Failed to configure watcher for build dependencies in dev loop\n - DEVINIT_REGISTER_TEST_DEPS: Failed to configure watcher for test dependencies in dev loop\n - DEVINIT_REGISTER_DEPLOY_DEPS: Failed to configure watcher for deploy dependencies in dev loop\n - DEVINIT_REGISTER_CONFIG_DEP: Failed to configure watcher for Skaffold configuration file.", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we do something to make this a more manageable string?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this was recently changed in #4083 . I will follow up later.
/cc @quoctruong
Fixes: #4036
Related: Relevant tracking issues, for context
Merge before/after: Dependent or prerequisite PRs
Description
In this PR,
User facing changes (remove if N/A)
No user facing changes.
Tools consuming Event API will see following new events in the sequence.
Follow-up Work (remove if N/A)