fix: Attempt to recover/invoke function when docker connectivity is poor #764

jfuss · 2018-11-12T21:13:59Z

Issue #, if available:
#529 and #451

Description of changes:
When working completely offline, SAM CLI fails when trying to download the docker image. Instead of failing right away, we will check to see if we have an older image that we could use instead. If we do not have an image locally, we fail gracefully with a message describing we could not download the image.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

sanathkr

I have two questions. not blocking but want to clarify

sanathkr · 2018-11-13T16:56:22Z

samcli/commands/local/invoke/cli.py

@@ -98,6 +99,8 @@ def do_cli(ctx, function_identifier, template, event, no_event, env_vars, debug_
        raise UserException("Function {} not found in template".format(function_identifier))
    except (InvalidSamDocumentException, OverridesNotWellDefinedError) as ex:
        raise UserException(str(ex))
+    except DockerImagePullFailedException as ex:


what about start_api and start_lambda clis? we should be capturing this exception somewhere right? If not, we should certainly catch and print pretty error message instead of a stack trace

I elected for the stack trace for these. Mainly because I didn't have a good way to surface this. Take start-api for example. When calling a localhost endpoint, the response from the API will be a service error. We could catch the error and 'pretty print' something but figured the stack trace is enough for now (since this is the current behavior).

I would like to think more through this case in general (how we behave offline) but not halt that on providing some benefit now. I have no problem adding this now, just explaining my thought process of why I initially didn't.

I see what you mean. If you don't catch the exception, then it is going to crash the process and print a stack trace right?

It would be rather better to print a proper error message and terminate the CLI than a crash.

The process doesn't crash. We rely on Flask for this. Flask will see the exception and then wrap this in a service 500 error: https://github.com/awslabs/aws-sam-cli/blob/develop/samcli/local/lambda_service/local_lambda_invoke_service.py#L114.

This is the same behavior for start-api. We only catch the errors we care about and handle them specifically, everything else gets rolled up into a 500 Error. The output will still show the stack trace of the error in the terminal.

Ah yes, I forgot about Flask. This makes sense and is consistent with a service-like behavior.

sanathkr · 2018-11-13T17:36:57Z

samcli/local/docker/manager.py

        """
        stream = stream or sys.stderr
        try:
            result_itr = self.docker_client.api.pull(image_name, stream=True, decode=True)
-        except docker.errors.ImageNotFound as ex:
-            raise DockerImageNotFoundException(str(ex))
+        except docker.errors.APIError as ex:


[nit] Looks like the default timeout on Docker Client is 60 seconds. Is it possible to add timeout to the docker_client.api.pull call as well that won't hog for 60 seconds if it can't find a network?

Good callout. I don't see this as an option for in docker_client.api.pull but this is configurable in the docker_client. This will affect poor network instead of no network (I have been testing locally with wifi off without waiting for the timeout). I can update the docker_client we use but what would be a reasonable timeout for this?

This PR doesn't handle the poor network case and I didn't aim to solve this here.

It’s a good practice to set timeouts anyway. In this case we can be more aggressive. If you want to fix poor network later, I am okay.

Agree. I want to tackle the greater poor network connectivity problem separately from this PR.

fix: Attempt to recover/invoke function when docker connectivity is poor

da26185

jfuss requested a review from sriram-mv November 12, 2018 21:14

sriram-mv approved these changes Nov 13, 2018

View reviewed changes

sanathkr reviewed Nov 13, 2018

View reviewed changes

sanathkr approved these changes Nov 14, 2018

View reviewed changes

jfuss merged commit 1fb22b1 into aws:develop Nov 15, 2018

jfuss mentioned this pull request Dec 27, 2018

500 error when running offline #451

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: Attempt to recover/invoke function when docker connectivity is poor #764

fix: Attempt to recover/invoke function when docker connectivity is poor #764

jfuss commented Nov 12, 2018

sanathkr left a comment

sanathkr Nov 13, 2018

jfuss Nov 13, 2018

sanathkr Nov 13, 2018

jfuss Nov 13, 2018

sanathkr Nov 14, 2018

sanathkr Nov 13, 2018

jfuss Nov 13, 2018

sanathkr Nov 14, 2018

jfuss Nov 15, 2018

fix: Attempt to recover/invoke function when docker connectivity is poor #764

fix: Attempt to recover/invoke function when docker connectivity is poor #764

Conversation

jfuss commented Nov 12, 2018

sanathkr left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment