SRV records for stopped instances don't always get removed #21

aaronbuchanan · 2017-07-28T16:36:33Z

We were getting intermittent 502 errors following deployments. Upon investigation, we found that SRV records still existed for tasks that had been stopped. Has anyone seen similar issues with this approach? Is there a best practice for keeping the SRV records healthy?


jogster · 2017-07-31T09:30:56Z

I found the same. This is because the ecssd_agent listens for events on the Docker port to register/deregister the Route53 SRV records.

Are you running the healthcheck lambda? That should be able to detect dead SRV record entries and remove them automatically.
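
To make the cleanup idea concrete, here is a minimal sketch of the comparison such a check might perform. It is not the project's actual lambda: the function names `parse_srv` and `stale_srv_values` are hypothetical, and it assumes you already have the SRV values from the record set and a list of live (host, port) endpoints from your own source of truth, such as the running ECS tasks.

```python
from typing import Iterable, List, Tuple


def parse_srv(value: str) -> Tuple[str, int]:
    """Return (target, port) from an SRV value 'priority weight port target'."""
    _priority, _weight, port, target = value.split()
    # Normalise the target: drop the trailing dot and ignore case.
    return target.rstrip(".").lower(), int(port)


def stale_srv_values(
    srv_values: Iterable[str],
    live_endpoints: Iterable[Tuple[str, int]],
) -> List[str]:
    """Return the SRV values whose (target, port) is not a live endpoint.

    Both arguments are assumptions for this sketch: the caller is expected
    to fetch the record values and the live endpoints itself.
    """
    live = {(host.rstrip(".").lower(), port) for host, port in live_endpoints}
    return [value for value in srv_values if parse_srv(value) not in live]


# Hypothetical data: host-b's task has stopped but its record remains.
records = ["1 1 32768 host-a.internal.", "1 1 32769 host-b.internal."]
live = [("host-a.internal", 32768)]
stale = stale_srv_values(records, live)
```

The stale values would then be dropped from the record set, for example with the Route 53 `ChangeResourceRecordSets` API. That step is left out because it depends on your hosted zone and record naming.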

wprater · 2017-08-22T21:24:22Z

> Are you running the healthcheck lambda? That should be able to detect dead SRV record entries and remove them automatically.

Can look into this, but I imagine there will be more lag than if the agent could listen to container events as well.

aaronbuchanan · 2017-08-22T23:09:25Z

Hi @jogster, is this healthcheck lambda configuration defined anywhere? Route 53's health checks don't seem to support SRV records (or any multi-answer DNS lookups).
