Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Some *useful* Pod status do not bubble up in TaskRun status #4802

Closed
vdemeester opened this issue Apr 27, 2022 · 9 comments
Closed

Some *useful* Pod status do not bubble up in TaskRun status #4802

vdemeester opened this issue Apr 27, 2022 · 9 comments
Assignees
Labels
kind/bug Categorizes issue or PR as related to a bug. kind/feature Categorizes issue or PR as related to a new feature.

Comments

@vdemeester
Copy link
Member

Sligtly related to #4801, when a Pod doesn't fail but is in a "long-term" un-reconciliable state, such as InvalidImageName, the TaskRun controller ignore those and wait forever (aka timeout).

This happens because it a fixable error from the Pod standpoint (you can mutate the image of a container in a Pod), and thus doesn't seem to be considered as a "terminal" error.

The TaskRun controller should pickup these kind of error (at least InvalidImageName, but most likely there might be a few other status that make sense) and fail with a clear message/reason based on it.

@vdemeester vdemeester added kind/bug Categorizes issue or PR as related to a bug. kind/feature Categorizes issue or PR as related to a new feature. labels Apr 27, 2022
@yuzp1996
Copy link
Contributor

yuzp1996 commented May 2, 2022

Hi, I am interested in this issue. May I take this one?

If I need any help I will mention it here. Thank you!

@yuzp1996
Copy link
Contributor

yuzp1996 commented May 2, 2022

/assign yuzp1996

@yuzp1996
Copy link
Contributor

yuzp1996 commented May 2, 2022

image

image

I think I have reproduced the error here. The pod is in the status of ImagePullBackOff while the related Taskrun is pending.

I think the expected result is that the Taskrun should fail with a clear message/reason based on it.

If I misunderstand something please tell me. Thank you!

@yuzp1996
Copy link
Contributor

yuzp1996 commented May 2, 2022

Hi, I check the code and find maybe we can do some improvement here.
image

I think this is not a bug fix but a new feature. Maybe I need to provide a TEP to describe the goal that we want to achieve and the code change that I need to make.

I will be very happy if you can give me some advice. @vdemeester Thanks!

@vdemeester
Copy link
Member Author

So ImagePullBackOff is not a terminal error as I think if the image becomes available later (before the TaskRun times out), the Pod will start and thus the Task as well — but maybe it's not 😅 .

Yeah this is probably a new feature, but I don't think it needs a TEP, just a proper pull request explaining the new errors we handle should be fine.

/cc @imjasonh @dibyom @jerop @abayer @lbernick

@yuzp1996
Copy link
Contributor

yuzp1996 commented May 5, 2022

OK, thanks! I will do it this week.

yuzp1996 added a commit to yuzp1996/pipeline that referenced this issue May 8, 2022
fix/feature: tektoncd#4802
@yuzp1996
Copy link
Contributor

yuzp1996 commented May 8, 2022

This is all the image related error I found in the k8s kubelet code base. I will try to bubble up all these errors in TaskRun status.

image

yuzp1996 added a commit to yuzp1996/pipeline that referenced this issue May 12, 2022
When pod is pending because of failure of pulling image the users
can not get the clear reason or message from the taskrun status.

Now I will check the pod status and judge if the error is caused
by image if so the error reason and message will bubble up to
taskrun status. Then user can know what happend though the taskrun
status.

Related issue: tektoncd#4802

Signed-off-by: yuzhipeng <[email protected]>
yuzp1996 added a commit to yuzp1996/pipeline that referenced this issue May 12, 2022
When pod is pending because of failure of pulling image the users
can not get the clear reason or message from the taskrun status.

Now I will check the pod status and judge if the error is caused
by image if so the error reason and message will bubble up to
taskrun status. Then user can know what happend though the taskrun
status.

Related issue: tektoncd#4802

Signed-off-by: yuzhipeng <[email protected]>
yuzp1996 added a commit to yuzp1996/pipeline that referenced this issue May 12, 2022
When pod is pending because of failure of pulling image the users
can not get the clear reason or message from the taskrun status.

Now I will check the pod status and judge if the error is caused
by image if so the error reason and message will bubble up to
taskrun status. Then user can know what happend though the taskrun
status.

Related issue: tektoncd#4802

Signed-off-by: yuzhipeng <[email protected]>
yuzp1996 added a commit to yuzp1996/pipeline that referenced this issue May 18, 2022
When pod is pending because of failure of pulling image the users
can not get the clear reason or message from the taskrun status.

Now I will check the pod status and judge if the error is caused
by image if so the error reason and message will bubble up to
taskrun status. Then user can know what happend though the taskrun
status.

Related issue: tektoncd#4802

Signed-off-by: yuzhipeng <[email protected]>
tekton-robot pushed a commit that referenced this issue May 19, 2022
When pod is pending because of failure of pulling image the users
can not get the clear reason or message from the taskrun status.

Now I will check the pod status and judge if the error is caused
by image if so the error reason and message will bubble up to
taskrun status. Then user can know what happend though the taskrun
status.

Related issue: #4802

Signed-off-by: yuzhipeng <[email protected]>
@lbernick
Copy link
Member

/close

fixed in #4846, thanks @yuzp1996!

@tekton-robot
Copy link
Collaborator

@lbernick: Closing this issue.

In response to this:

/close

fixed in #4846, thanks @yuzp1996!

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/bug Categorizes issue or PR as related to a bug. kind/feature Categorizes issue or PR as related to a new feature.
Projects
None yet
Development

No branches or pull requests

4 participants