-
Notifications
You must be signed in to change notification settings - Fork 1.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix/log unavailable warning #3848
Fix/log unavailable warning #3848
Conversation
Hi @jonasdebeukelaer. Thanks for your PR. I'm waiting for a kubeflow member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
feels like this failing build has nothing to do with ui? 🤔 broken build shows
|
8f75ccc
to
88c8729
Compare
@jonasdebeukelaer Thanks a lot for your contribution! This is absolutely awesome for UX. |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: Bobgy The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
Yes, the travis unit tests are flaky. I'll help you retry them. |
/retest |
/test kubeflow-pipeline-frontend-test |
88c8729
to
5bbe7b5
Compare
changes:
node: I've made namespace required for Apis.getPodLogs, not sure why it was optional before? |
/retest |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome! I've left some minor comments and nitpickings.
frontend/src/pages/RunDetails.tsx
Outdated
logsBannerMessage = 'Failed to retrieve pod logs.'; | ||
|
||
// if pod can't be found, then assume it has gone from the server | ||
if (await this.podNotFound(selectedNodeDetails.id, namespace)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: just confirm the error message returned from Apis.getPodLogs
has no hint whether the pod is not found?
If no, this seems a reasonable workaround.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it does allude to workflow not being found, or something along those lines, but the error code is 500, so would have to do a msg contains text type check. Feels a bit brittle but happy to do that instead
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm okay with text check in this PR, if you feel you have capacity, we can also detect and change the error code in
stream.on('error', err => res.status(500).send('Could not get main container logs: ' + err)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ohh I thought this was interfacing straight with argo 🤦 . This makes sense then yeah , will do here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so I've updated this but don't have time to stand up the whole backend to test it. is there some kind of short cut way to do it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you have a running KFP cluster?
If yes, you can follow https://github.com/kubeflow/pipelines/blob/master/frontend/README.md#proxy-to-a-real-cluster, you will need to develop "Client UI + Node server". It will send api requests to the cluster for anything else.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
unfortunately, we haven't added integration tests for pod logs api method.
pipelines/frontend/server/app.test.ts
Lines 1041 to 1042 in d6920ca
// TODO: Add integration tests for k8s helper related endpoints | |
// describe('/k8s/pod/logs', () => {}); |
You can also use integration test as a way for testing it too, I think there are plenty of example test cases next to it. Feel free to ask me if you need any help.
frontend/src/pages/RunDetails.tsx
Outdated
sidepanelBannerMode = 'error'; | ||
break; | ||
case NodePhase.FAILED: | ||
sidepanelBannerMode = 'warning'; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry I don't have much context here, can you help me understand the difference between ERROR
and FAILED
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll be honest I'm not 100% of this either, but I would understand this to be
- ERROR if there was a error trying to set up the container or something, or network error, or issue in argo, and
- FAILED means the execution of the container failed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the explanation, I'm not sure I understand why you want to show warning
for failed node.
For users, it seems to me they should also perceive it as an error.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's to try to separate pipelines failure from system failure, but happy to change to error type too
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Understood. IMO, can we add some wording in the error message to indicate difference between fail
and error
.
e.g. error
-> Pipelines System Error: original message
; while fail
-> Failed execution: ...
what do you think?
This might be a little controversial, we can also revert this part keeping current behavior, so that we can get this PR merged. Opening a new PR for this specific issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well I'll make em all Error type, and we can add a separate issue to add those prefixes?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good! Thank you!
@Bobgy oops totally missed your update! Should have some time to fix this evening 👍 |
5bbe7b5
to
47a372e
Compare
/retest |
/retest |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jonasdebeukelaer Great work!
Left some comments
frontend/src/pages/RunDetails.tsx
Outdated
sidepanelBannerMode = 'error'; | ||
break; | ||
case NodePhase.FAILED: | ||
sidepanelBannerMode = 'warning'; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the explanation, I'm not sure I understand why you want to show warning
for failed node.
For users, it seems to me they should also perceive it as an error.
47a372e
to
c3270dd
Compare
frontend/src/pages/RunDetails.tsx
Outdated
sidepanelBannerMode = 'error'; | ||
break; | ||
case NodePhase.FAILED: | ||
sidepanelBannerMode = 'warning'; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Understood. IMO, can we add some wording in the error message to indicate difference between fail
and error
.
e.g. error
-> Pipelines System Error: original message
; while fail
-> Failed execution: ...
what do you think?
This might be a little controversial, we can also revert this part keeping current behavior, so that we can get this PR merged. Opening a new PR for this specific issue.
}} | ||
/>, | ||
); | ||
expect(tree.findWhere(el => el.text() === 'Refresh')).toEqual({}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: I don't quite understand, why toEqual({})
?
Let me take a look.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Verified, toEqual({})
will succeed no matter nodes found or not, because the wrapper helper object has no direct properties all the time.
Please use this way to verify not found:
pipelines/frontend/src/components/viewers/Tensorboard.test.tsx
Lines 323 to 330 in fcd2559
expect( | |
tree | |
.findWhere( | |
el => | |
el.text() === `Tensorboard is starting, and you may need to wait for a few minutes.`, | |
) | |
.exists(), | |
).toEqual(false); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh thanks for that, sorry I'm quite new to ts tests
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No worries, you are doing great
c3270dd
to
f9627d8
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome work, thanks a lot for your continued efforts!
I'm okay with no additional test for pod logs for now, because the change is small and easily reviewable.
Would you mind updating the last minor issue?
/lgtm
err?.message && | ||
err.message?.indexOf('Unable to find pod log archive information') > -1 | ||
) { | ||
res.status(404).send('pod not found'); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: also append err
to make sure we can debug if there are false-positives of 404s?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ohhh, I forgot I approved before.
Feel free to update it in a following up PR
* [UI] only allow troubleshoot link on error banner * [UI] improve use of banners in run view sidepanel * [UI] add info type banner
* [UI] only allow troubleshoot link on error banner * [UI] improve use of banners in run view sidepanel * [UI] add info type banner
* [UI] only allow troubleshoot link on error banner * [UI] improve use of banners in run view sidepanel * [UI] add info type banner
Fixes #3711