This repository has been archived by the owner on Dec 4, 2024. It is now read-only.

Add an endpoint to reload HAProxy config #312

Merged: 18 commits, Sep 29, 2016

Conversation

JayH5
Contributor

@JayH5 JayH5 commented Sep 19, 2016

EDIT:
This PR adds the ability to trigger a reload manually by sending a SIGHUP or SIGUSR1 to the marathon-lb script. It also adds HAProxy Lua extensions so that those signals can be sent to marathon-lb using two HTTP endpoints.

Old comment:
This PR adds the ability to trigger a reload manually via a very (very) basic HTTP API.

Making a POST /reload request to marathon-lb triggers a full reload, as though a relevant event had been received from Marathon.

You can also do POST /reload?existing=true to trigger a reload of the existing config, without fetching the app data from Marathon.
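A minimal sketch of what such an endpoint could look like with wsgiref, which the description says the original prototype built on. The names `make_reload_app`, `trigger_reload`, and `serve`, and the port number, are hypothetical and not taken from the PR's code.

```python
from urllib.parse import parse_qs
from wsgiref.simple_server import make_server


def make_reload_app(trigger_reload):
    """Build a WSGI app. trigger_reload(existing) is a hypothetical callback."""

    def app(environ, start_response):
        if environ.get("PATH_INFO") != "/reload":
            start_response("404 Not Found", [("Content-Type", "text/plain")])
            return [b"not found\n"]
        if environ.get("REQUEST_METHOD") != "POST":
            start_response("405 Method Not Allowed",
                           [("Content-Type", "text/plain"), ("Allow", "POST")])
            return [b"method not allowed\n"]
        # ?existing=true means: reload the current config without refetching apps.
        query = parse_qs(environ.get("QUERY_STRING", ""))
        existing = query.get("existing", ["false"])[0].lower() == "true"
        # Only trigger the reload; the response does not wait for it to finish.
        trigger_reload(existing)
        start_response("200 OK", [("Content-Type", "text/plain")])
        return [b"reload triggered\n"]

    return app


def serve(trigger_reload, port=8080):
    """Serve the app forever on an assumed port (not called automatically)."""
    make_server("", port, make_reload_app(trigger_reload)).serve_forever()
```

With something like this running, `curl -X POST 'localhost:8080/reload?existing=true'` would invoke the callback with `existing=True`.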

I think this API would generally be useful from a debugging standpoint -- if an event from Marathon is missed or something it might be useful to be able to poke marathon-lb to get a reload without restarting the whole container/service.

Our actual use case for this uses the ?existing=true endpoint. In some cases the HAProxy config text hasn't changed but we still want to reload HAProxy because something else has changed -- most notably an SSL certificate could have been renewed or a certificate file added to a directory that HAProxy loads. We're trying to build an automated system to issue certificates that would then ping marathon-lb to load the new certs.

I've tried to add this with as little disruption as possible. So I'm starting with the precedent set by the event server and just using the facilities in the wsgiref module. Also, there aren't really tests for the existing event server, and the infrastructure needed to test all this is... well... there would be quite a lot of it. So there aren't really tests for my addition. I've tested it locally and it works as I'd expect 😕.

I've also only added this functionality when in SSE mode, as the event server mode is deprecated and I didn't want to have 2 different listening ports for marathon-lb.

There are a couple of areas where things are a bit rough, mostly due to the existing structure of marathon-lb (the event processor has a thread and is notified to do reloads). The main issues I have are:

  • The reload endpoint is really just a trigger. It returns 200/OK immediately and just starts the procedure of reloading.
  • It's not super clear who "owns" the event processor. With this API, both the event stream and the API server need to have references to the processor in order to trigger reloads. They both call processor.stop() if something goes wrong, in their finally clauses. I'm not sure how best to deal with this.
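To make the two issues above concrete, here is a rough sketch (not the actual marathon-lb event processor) of a worker thread that is merely notified to reload, with an idempotent `stop()` so that more than one owner can call it from a `finally` clause. The class name `ReloadProcessor` and the `do_reload` callback are hypothetical.

```python
import threading


class ReloadProcessor:
    """Hypothetical stand-in for the event processor described above."""

    def __init__(self, do_reload):
        self._do_reload = do_reload
        self._cond = threading.Condition()
        self._pending = False
        self._stopped = False
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def trigger(self):
        # Returns immediately; the worker thread performs the reload later.
        with self._cond:
            self._pending = True
            self._cond.notify()

    def stop(self):
        # Idempotent, so several owners can call it from finally clauses.
        with self._cond:
            if self._stopped:
                return
            self._stopped = True
            self._cond.notify()
        self._thread.join()

    def _run(self):
        while True:
            with self._cond:
                while not self._pending and not self._stopped:
                    self._cond.wait()
                if self._stopped:
                    # A reload still pending at this point is dropped.
                    return
                self._pending = False
            # Run the reload outside the lock so trigger() never blocks on it.
            self._do_reload()
```

Making `stop()` safe to call twice is only one way to sidestep the ownership question; it does not settle which component should own the processor.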

Anyway, this is a first prototype so feedback is appreciated :)

@mesosphere-ci

Can one of the admins verify this patch?

@brndnmtthws
Contributor

test this please

@JayH5
Contributor Author

JayH5 commented Sep 19, 2016

Tests seem to be failing due to my changes in #308 finally taking effect. Jenkins isn't installing requirements-dev.txt.

@brndnmtthws
Contributor

I'm generally on board with this, but could we do it in such a way that we don't add a new endpoint to the Python script? It would be great if we could do it through HAProxy. How about we have a Lua script endpoint which sends a SIGHUP to the Python script?

@JayH5
Contributor Author

JayH5 commented Sep 22, 2016

@brndnmtthws I did have that thought too but initially thought it would be too complicated.

I had a go at implementing signal handling now and it seems better than I had expected :)

So now there are two signals:

  • SIGHUP (:9090/_mlb_signal/hup): Reload config completely as though there was an event from Marathon.
  • SIGUSR1 (:9090/_mlb_signal/usr1): Reload existing config (i.e. restart HAProxy).
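On the Python side, handling the two signals can be sketched roughly as below. This is an illustration under assumptions rather than the code in this PR: the event names and `install_handlers` are hypothetical, and a real main loop would wait on the events and perform the corresponding reload.

```python
import os
import signal
import threading

# Flags a main loop could wait on; the names are hypothetical.
full_reload = threading.Event()      # SIGHUP: refetch apps and regenerate config
existing_reload = threading.Event()  # SIGUSR1: reload HAProxy with existing config


def install_handlers():
    # Handlers only set flags; the actual reload work happens elsewhere.
    # Must be called from the main thread (a requirement of signal.signal).
    signal.signal(signal.SIGHUP, lambda signum, frame: full_reload.set())
    signal.signal(signal.SIGUSR1, lambda signum, frame: existing_reload.set())
```

After `install_handlers()`, running `kill -HUP <pid>` or `kill -USR1 <pid>` against the process sets the matching flag.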

I've left the HTTP endpoints as just accepting any HTTP method. These endpoints are a little different to the others in that they are "mutating" in a way, so perhaps they should be limited to POST or PUT... but the existing endpoints aren't limited to certain methods and HAProxy isn't really meant to be used to build real APIs...

I made the API endpoints quite explicit -- /_mlb_signal/hup instead of something like /_mlb_reload so that it's very clear about what's going on and that it's just a signal being fired (so no claims are made about the outcome of the signal).

As usual, tested locally but can't promise more than that 😜

The git log is a bit of a mess now... let me know if I should do a rebase.

@brndnmtthws
Contributor

test this please

end

core.register_service("signalmlbhup", "http", function(applet)
  local _, success, code = run("pkill -HUP -f '^python.*marathon_lb.py'")
Contributor

One concern here is what happens if they're running multiple MLBs within the same PID namespace.
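The concern can be illustrated with a small check. `pkill -f` matches its pattern against each process's full command line; the snippet below approximates that with Python's `re` module (pkill itself uses extended regular expressions, but the two agree for this pattern). The command lines are made-up examples.

```python
import re

# The pattern passed to `pkill -f` in the diff above.
PATTERN = re.compile(r"^python.*marathon_lb.py")

# Hypothetical command lines for processes sharing one PID namespace.
cmdlines = [
    "python3 /marathon-lb/marathon_lb.py --group external",
    "python3 /marathon-lb/marathon_lb.py --group internal",
    "haproxy -f /marathon-lb/haproxy.cfg",
]

# Every matching process would receive the signal, not just "our" instance.
matches = [c for c in cmdlines if PATTERN.search(c)]
```

Both marathon-lb command lines match, so a signal sent this way would reach both instances.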

end)

core.register_service("signalmlbusr1", "http", function(applet)
  local _, success, code = run("pkill -USR1 -f '^python.*marathon_lb.py'")
Contributor

Ditto.

@brndnmtthws
Contributor

What if we had MLB pass its own PID as an env var to HAProxy?

@brndnmtthws
Contributor

Actually I suppose it may be safe to ignore, since we already assume MLB is running within a namespaced container elsewhere in the code. I suppose we could just make that a requirement, which simplifies things.

applet:send(response)
end

core.register_service("signalmlbhup", "http", function(applet)
Contributor

Maybe we should limit these endpoints to POST requests?

Contributor Author

I'm not sure how best to do this with HAProxy... I could do something like this (I think? The HAProxy manual is long and complicated):

  acl signalmlbhup path /_mlb_signal/hup
  http-request deny deny_status 405 if !METH_POST signalmlbhup
  http-request use-service lua.signalmlbhup if signalmlbhup
  acl signalmlbusr1 path /_mlb_signal/usr1
  http-request deny deny_status 405 if !METH_POST signalmlbusr1
  http-request use-service lua.signalmlbusr1 if signalmlbusr1

As I said in an earlier comment, HAProxy is not ideal for doing this, and the other endpoints don't limit their valid methods.

@JayH5
Contributor Author

JayH5 commented Sep 26, 2016

@brndnmtthws yeah I don't feel that pkill is the most elegant solution... but there are already commands like pidof haproxy being run in getpids.lua. I would like a nice way to pass PIDs around (maybe PID files?) but this seemed like the simplest solution.
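For comparison, the PID-file idea mentioned above might look roughly like this. It is only a sketch of the alternative, not something this PR implements; the function names and any file path are hypothetical.

```python
import os
import signal
from pathlib import Path


def write_pid_file(path):
    """Record this process's PID so another process can signal it."""
    Path(path).write_text(f"{os.getpid()}\n")


def signal_from_pid_file(path, signum):
    """Send signum to the PID stored in path and return that PID."""
    pid = int(Path(path).read_text().strip())
    os.kill(pid, signum)
    return pid
```

marathon-lb would call `write_pid_file` at startup, and the signalling side would call something like `signal_from_pid_file(path, signal.SIGHUP)`, targeting exactly one process instead of everything matching a pattern.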

@brndnmtthws
Contributor

Would you mind adding a note to the README under the 'operation best practices' section stating that MLB should be run inside a container with namespace isolation?

@JayH5
Contributor Author

JayH5 commented Sep 27, 2016

@brndnmtthws I added a README note... let me know what you think about limiting the APIs to POST-only.

@brndnmtthws
Contributor

test this please

@brndnmtthws brndnmtthws merged commit 54935d9 into d2iq-archive:master Sep 29, 2016