Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

No stderr logging when docker image not found #271

Open
nick-youngblut opened this issue Sep 19, 2023 · 4 comments
Open

No stderr logging when docker image not found #271

nick-youngblut opened this issue Sep 19, 2023 · 4 comments

Comments

@nick-youngblut
Copy link

nick-youngblut commented Sep 19, 2023

If I run:

dsub \
  --provider local \
  --project ${PROJECT_ID} \
  --regions ${REGION} \
  --logging dsub-debug.log \
  --image ubuntu:20.04x \
  --output OUT=gs://${BUCKET_NAME}/output/out.txt \
  --command 'echo "Hello World" > "${OUT}"' \
  --wait

I get the following written to dsub-debug.log:

Unable to find image 'ubuntu:20.04x' locally
docker: Error response from daemon: manifest for ubuntu:20.04x not found: manifest unknown: manifest unknown.
See 'docker run --help'.

However, if I run the job on google-batch instead of local, the ${jobid}.stderr file set via the --logging parameter is empty.

The lack of an error messages in the *.stderr logging files created by google-batch makes it very hard to troubleshoot failed jobs.

I'm using dsub 0.4.9

@wnojopra
Copy link
Contributor

wnojopra commented Sep 19, 2023

Hi @nick-youngblut

Thanks for reporting. There was an update in the Batch API that changed the way log files were written to disk. We're in the process of making the dsub changes and pushing a new release. In the meantime, the same command should work with the google-cls-v2 provider.
[EDIT]: I misunderstood the issue. See #271 (comment)

@nick-youngblut
Copy link
Author

nick-youngblut commented Sep 19, 2023

Thanks @wnojopra for the quick feedback! If I use google-cls-v2, then no logging files are generated at gs://${BUCKET_NAME}/logging/:

dsub \
  --provider google-cls-v2 \
  --project ${PROJECT_ID} \
  --regions ${REGION} \
  --logging gs://${BUCKET_NAME}/logging/ \
  --image ubuntu:20.04x \
  --output OUT=gs://${BUCKET_NAME}/output/out.txt \
  --command 'echo "Hello World" > "${OUT}"' \
  --wait

It appears that if the --provider google-cls-v2 job fails (at least for a non-existent docker image), no --logging files are created. However, if the --provider google-cls-v2 job succeeds, the --logging files are created.

@wnojopra
Copy link
Contributor

Thanks @nick-youngblut . I misunderstood the issue. The confusion comes from how the API pulls the docker image. From https://cloud.google.com/life-sciences/docs/reference/rest/v2beta/projects.locations.pipelines/run#Action in the imageUri section it mentions "Note that all images referenced by actions in the pipeline are pulled before the first action runs."

Logging is done in one of the actions of a dsub job. It starts relatively early, but as noted the API will attempt to pull the image before the logging action starts. This is why you have no --logging files.

I do notice that dsub itself outputs a helpful error message in this case:

['Error in echo--willyn--230919-201554-73 - code 9: Execution failed: generic::failed_precondition: pulling image: docker pull: running ["docker" "pull" "ubuntu:20.04x"]: exit status 1 (standard error: "Error response from daemon: manifest for ubuntu:20.04x not found: manifest unknown: manifest unknown\\n")']
JobExecutionError: One or more jobs finished with status FAILURE or CANCELED during wait.

So in this case, the error couldn't make it to the logs, but it is output by dsub itself.

@wnojopra wnojopra changed the title No stderr logging for google-batch No stderr logging when docker image not found Sep 19, 2023
@nick-youngblut
Copy link
Author

Thanks @wnojopra for all of the information!

as noted the API will attempt to pull the image before the logging action starts

Would it be possible to continue the dsub job (try-except) to include the logging, regardless of whether the image pull was successful? Inconsistent logging behavior can be harder to troubleshoot (e.g., "in this case, check the logs... but in this other case there are no logs, so you have to check...").

I do notice that dsub itself outputs a helpful error message in this case:

Thanks for pointing that out! Is there a way to get a "pretty" print of the logs directly in dsub? Each line of the logs is so long that reading them (mentally parsing the logs) can be a challenge.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants