
Downgrade docker-compose.yaml to version 3.3 so that we can support Ubuntu 20.04.4 LTS #329

Merged
merged 5 commits on Oct 15, 2022

Conversation

andygrove
Member

Which issue does this PR close?

N/A

Rationale for this change

I could not build Docker images on Ubuntu 20.04.4 LTS, which ships an older docker-compose that rejects the compose file version we previously specified.

What changes are included in this PR?

Specify version 3.3 in docker-compose.yaml

Are there any user-facing changes?

@@ -54,8 +54,7 @@ services:
     volumes:
       - ./benchmarks/data:/data
     depends_on:
-      ballista-scheduler:
-        condition: service_healthy
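For context, the long-form `depends_on` with a `condition` is not part of the compose 3.x file format that older docker-compose releases validate, which is why the diff drops it. A minimal sketch of the downgraded file follows; the service names and the data volume come from the diff, and the scheduler port from the log output later in this thread, but the images and other fields are illustrative assumptions, not the actual project file:

```yaml
# Minimal sketch, not the actual project file. Service names and the
# volume come from the PR diff; images and other fields are assumptions.
version: "3.3"
services:
  ballista-scheduler:
    image: ballista-scheduler
    ports:
      - "50050:50050"
  ballista-executor:
    image: ballista-executor
    volumes:
      - ./benchmarks/data:/data
    depends_on:
      # Compose file format 3.x only supports this short form; the
      # `condition: service_healthy` long form was removed in the diff.
      - ballista-scheduler
```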
Member Author
We should be able to start executors before the scheduler runs, but I haven't tested this in a long time. If we require the scheduler to be running first, then we should file a bug for that.

Member Author

I tested this and the executor fails if it cannot connect to the scheduler. In theory, this should be fine in an environment such as k8s because the pod will just keep restarting until it succeeds.

Contributor

This is docker-compose, not k8s - so maybe we need to mess with restart policy? @iajoiner might have more to say.

Contributor

The restart config option could be added to the compose file: https://docs.docker.com/compose/compose-file/#restart

restart: always
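A sketch of that suggestion applied to the executor service (the service name comes from the diff; everything else is illustrative):

```yaml
# Illustrative only: restart the executor whenever it exits, e.g. because
# it could not reach the scheduler on its first connection attempt.
services:
  ballista-executor:
    restart: always
    depends_on:
      - ballista-scheduler
```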

Member Author

A restart policy only takes effect after a container starts successfully. In this case, starting successfully means that the container is up for at least 10 seconds and Docker has started monitoring it. This prevents a container which does not start at all from going into a restart loop.

😢
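One common workaround under compose 3.3 is to poll the scheduler port in the entrypoint before launching the executor, so a failed first connection never kills the container. This is a sketch only; the image name and binary path are assumptions, not from this repo, and the port comes from the log output later in this thread:

```yaml
# Illustrative sketch: block until the scheduler's port accepts
# connections, then exec the real binary. Image and path are assumed.
services:
  ballista-executor:
    image: ballista-executor
    depends_on:
      - ballista-scheduler
    entrypoint: >
      sh -c 'until nc -z ballista-scheduler 50050; do sleep 1; done;
             exec /usr/local/bin/ballista-executor'
```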

Contributor

I am tempted to add a --retry-timeout-seconds 10 option to the executor, which would fix this. Ballista is targeted at enterprise users, who are often not on the latest distro, so it would be nice for this to work out of the box for as many people as possible.

I was thinking of the same option, given that the application does not seem to have a hard requirement to be available within milliseconds of startup. I also second having it work out of the box for as many users as possible.

Contributor

@onthebridgetonowhere Oct 15, 2022

@andygrove - I am not too familiar with the codebase yet, but I saw that if the scheduler is killed while the executor is running, the executor retries the connection forever, and once the scheduler is restarted, the executor is fine. Probably a silly question, but couldn't this behaviour apply from the start? Why does the executor need to connect to a running scheduler instance first, instead of retrying the default/configured address until it succeeds?

Member Author

TaskRunnerPool loops forever, polling schedulers to fetch tasks to run. This code opens a new connection to the scheduler each time, which is inefficient but does mean that it eventually recovers from scheduler downtime.

Member Author

I pushed a commit to add the new config. This fixes it for me.

ballista-executor_1   | 2022-10-15T15:44:09.490225Z  WARN main ThreadId(01) ballista_executor: Failed to connect to scheduler at http://ballista-scheduler:50050 (Could not connect to scheduler); retrying ...    
ballista-executor_1   | 2022-10-15T15:44:09.991817Z  INFO main ThreadId(01) ballista_executor: Connected to scheduler at http://ballista-scheduler:50050    

Contributor

> TaskRunnerPool loops forever, polling schedulers to fetch tasks to run. This code opens a new connection to the scheduler each time, which is inefficient but does mean that it eventually recovers from scheduler downtime.

I see, I thought it might be the case. Thanks for taking the time to explain!

@andygrove andygrove changed the title Use Docker version 3.3 so that we can support Ubuntu 20.04.4 LTS Downgrade docker-compose.yaml to version 3.3 so that we can support Ubuntu 20.04.4 LTS Oct 8, 2022
@andygrove andygrove mentioned this pull request Oct 8, 2022
Contributor

@avantgardnerio left a comment

LGTM

@andygrove
Member Author

Thanks for the reviews @avantgardnerio and @onthebridgetonowhere

@andygrove andygrove merged commit 9ad583e into apache:master Oct 15, 2022
@andygrove andygrove deleted the downgrade-docker-compose-version branch October 15, 2022 21:18