
No port for job queue consumer #376

Merged
merged 3 commits into 0.10.x from feature/no-ports-for-akeneo-job-queue-consumer on Sep 17, 2020

Conversation

kierenevans (Contributor)

Fixes a readinessProbe failure, as the only processes inside are:

root@akeneo-job-queue-consumer-5f74f468dd-nmt6h:/app# ps aux
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root         1  0.0  0.2  26764 17756 ?        Ss   Aug20   0:18 /usr/bin/python2 /usr/bin/supervisord -c /etc/supervisor/supervisord.conf -n
www-data     9  0.0  0.9 293088 81372 ?        S    Aug20   0:06 php /app/bin/console akeneo:batch:job-queue-consumer-daemon
www-data    10  0.0  1.0 293088 81988 ?        S    Aug20   0:07 php /app/bin/console akeneo:batch:job-queue-consumer-daemon
www-data    11  0.0  1.0 293084 81884 ?        S    Aug20   0:07 php /app/bin/console akeneo:batch:job-queue-consumer-daemon
www-data    12  0.0  1.0 293088 82008 ?        S    Aug20   0:07 php /app/bin/console akeneo:batch:job-queue-consumer-daemon
www-data    13  0.0  0.9 293088 81440 ?        S    Aug20   0:06 php /app/bin/console akeneo:batch:job-queue-consumer-daemon
root        44  0.0  0.0   4000  3076 pts/0    Ss   10:21   0:00 bash
root       336  0.0  0.0   7640  2636 pts/0    R+   10:21   0:00 ps aux
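
For illustration, a quick way to confirm this from inside the container (assuming the iproute2 `ss` tool is present in the image, which it may not be in a slim base) is to list listening TCP sockets; with only supervisord and the consumer daemons running, nothing listens, so a port-based readinessProbe can never succeed:

```sh
# Hypothetical check inside the consumer container: list listening TCP sockets.
# Assumes `ss` (iproute2) is installed; `netstat -tlnp` would be an alternative.
ss -lnt
# Expected (given the process list above): header only, no listening sockets,
# since neither supervisord nor the akeneo:batch:job-queue-consumer-daemon
# workers bind to a TCP port.
```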

kierenevans added the bug (Something isn't working) and harness-akeneo (Akeneo harness) labels on Aug 21, 2020
kierenevans added this to the 0.10.0 milestone on Aug 21, 2020
andytson-inviqa dismissed their stale review August 21, 2020 15:48

not a firm approval given how we did it in other services

andytson-inviqa (Contributor) left a comment:

It'd be better to make this consistent with the existing non-HTTP containers like app-init/migrate/console, which use a command.

kierenevans (Contributor, Author)

I disagree that a pod should restart if the database is down; that's cascading the failure. But we can make it test whether the process we wish to run is running.
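
A minimal sketch of that idea (not what was merged; the script path is only an assumption carried over from the probe excerpt further down) would be an exec readiness command that checks the consumer process exists:

```sh
#!/bin/sh
# Hypothetical /bin/readiness.sh: succeed only if at least one job-queue
# consumer daemon is running. Assumes pgrep (procps) is available in the image.
pgrep -f 'akeneo:batch:job-queue-consumer-daemon' > /dev/null
```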

andytson-inviqa (Contributor) commented Sep 17, 2020

> I disagree that a pod should restart if the database is down; that's cascading the failure. But we can make it test whether the process we wish to run is running.

It may be for reasons other than the database being down, for instance a node network link failure or incorrect DB credentials. We treat normal apps as unhealthy when their connections are down, and not just because of the experience to users.

The other containers I mention already do this, so your objection applies to them as well; you could change the state command in another PR.

andytson-inviqa (Contributor) commented Sep 17, 2020

But either way, I think it should be something reportable rather than only logged when a worker job has run, as there are more cases of DB connection failure than just the database itself being down.

andytson-inviqa (Contributor) commented Sep 17, 2020

I don't think it's worth checking that the process we're running is running, as docker/tini should shut down the container/pod if it exits (unless you can think of other cases?).

kierenevans (Contributor, Author) commented Sep 17, 2020

Forgot this was just a readiness probe, so the container won't get restarted upon a readinessProbe failure, just marked unready.

This container gets no traffic via a Service, so whether it is ready or not does not matter.

It doesn't make sense to check for processes in a readinessProbe, so I'll convert it to check whether the database is available, as recommended.
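
As a rough sketch of that direction (the actual script in the harness may differ, and the environment variable names here are assumptions), the readiness command could ping the database:

```sh
#!/bin/sh
# Hypothetical /bin/readiness.sh: ready only when the database answers.
# Assumes the mysql client tools are installed in the image and that
# connection details are exposed via these (assumed) variables.
set -e
mysqladmin ping \
  --host="${DATABASE_HOST:-mysql}" \
  --user="${DATABASE_USER:-akeneo}" \
  --password="${DATABASE_PASSWORD}" \
  --silent
```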

Supervisor is still in use for this image as it's based on php-fpm, so any application crashes will get restarted.

What we are missing is any kind of consequence for crashing more than X times in a row, which would cause the pod to crash and be rescheduled elsewhere, e.g. https://github.com/continuouspipe/dockerfiles/blob/master/ubuntu/16.04/etc/supervisor/conf.d/kill_supervisord_upon_fatal_process_state.conf + https://github.com/continuouspipe/dockerfiles/blob/master/ubuntu/16.04/usr/local/share/supervisord/kill_supervisord_upon_fatal_process_state.py

kierenevans force-pushed the feature/no-ports-for-akeneo-job-queue-consumer branch from 68147ac to 6732c07 on September 17, 2020 12:12
initialDelaySeconds: 5
exec:
  command:
    - /bin/readiness.sh
andytson-inviqa (Contributor) commented Sep 17, 2020

Ah, I had thought the app/sidekick stuff (so app state) was copied into php-fpm, but it seems not.

kierenevans (Contributor, Author)

Oops, bash: mysqladmin: command not found
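
If mysqladmin is not installed in the image, one purely illustrative fallback (not necessarily what the final commit does) is to reuse the PHP CLI that is already present for the connectivity check:

```sh
#!/bin/sh
# Hypothetical fallback readiness check using the PHP CLI instead of mysqladmin.
# Assumes the pdo_mysql extension is enabled; the environment variable names
# are assumptions.
php -r '
    $host = getenv("DATABASE_HOST") ?: "mysql";
    $user = getenv("DATABASE_USER") ?: "akeneo";
    $pass = getenv("DATABASE_PASSWORD") ?: "";
    try {
        new PDO("mysql:host=" . $host, $user, $pass);
    } catch (PDOException $e) {
        exit(1);
    }
'
```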

kierenevans merged commit c4100e0 into 0.10.x on Sep 17, 2020
kierenevans deleted the feature/no-ports-for-akeneo-job-queue-consumer branch on September 17, 2020 14:25