Skip to content
This repository has been archived by the owner on Feb 1, 2021. It is now read-only.

Don't forward _ping requests to the primary #2554

Closed
wants to merge 1 commit into from

Conversation

wsong
Copy link
Contributor

@wsong wsong commented Nov 28, 2016

No description provided.

@wsong
Copy link
Contributor Author

wsong commented Nov 28, 2016

In #2492, we noticed that the /_ping endpoint forwards to the primary now, which creates an HTTP connection hijacking which causes subsequent /info requests to get redirected straight to the primary. This PR ensures that /_ping requests are still entirely handled by the local node while still checking the primary's status on replicas.

@nishanttotla nishanttotla added this to the 1.2.6 milestone Nov 28, 2016
@dongluochen
Copy link
Contributor

we noticed that the /_ping endpoint forwards to the primary now, which creates an HTTP connection hijacking which causes subsequent /info requests to get redirected straight to the primary

Requests should be independent of each other. If this is the case we have a bug. Can you provide the reproduce sequence for this?

@wsong
Copy link
Contributor Author

wsong commented Nov 29, 2016

There's more discussion in #2526. We never fully figured out a solid repro for this issue, but both me and @michelvocks saw this issue.

@michelvocks
Copy link
Contributor

michelvocks commented Nov 30, 2016

@dongluochen @wsong You are right. This is a bug. I spent more time on investigating the issue and the root of it is this commit: docker/engine-api@5dd6452

The client is reusing the same connection. That means it first sends the _ping request and then uses the same connection to send the /info request. The hijacking method in swarm looks like this:

cp := func(dst io.Writer, src io.Reader, chDone chan struct{}) {
		io.Copy(dst, src)
		if conn, ok := dst.(interface {
			CloseWrite() error
		}); ok {
			conn.CloseWrite()
		}
		close(chDone)
	}
	inDone := make(chan struct{})
	outDone := make(chan struct{})
	go cp(d, nc, inDone)
	go cp(nc, d, outDone)

The go-routine reads the whole stream and then forwards it to the primary docker swarm manager and because we are using the same connection for the _ping request as well as the /info request it will forward both requests. I'm now not quite sure what's the right way to go when looking at the design. Should this be fixed in the docker client or at docker swarm?

@wsong wsong force-pushed the improve_ping_again branch from 71c709d to ff26600 Compare November 30, 2016 21:38
@wsong
Copy link
Contributor Author

wsong commented Nov 30, 2016

Rebased on top of #2492

@wsong wsong force-pushed the improve_ping_again branch from ff26600 to e15bc28 Compare November 30, 2016 23:00
@dongluochen dongluochen self-assigned this Dec 1, 2016
@nishanttotla
Copy link
Contributor

@wsong was there further discussion about this PR? Have you faced the issue lately?

@wsong
Copy link
Contributor Author

wsong commented Jan 17, 2017

This isn't so much an issue as it is a feature request; because Swarm replicas can occasionally lose their connection to the primary, we want to be able to hit an endpoint telling us whether or not there is currently an elected primary, from the perspective of this replica. This is useful when, for instance, you add a new Swarm manager and it takes a while to connect to the KV store; we can hit _ping in a retry loop to determine when the new manager is ready.

@wsong
Copy link
Contributor Author

wsong commented Mar 19, 2018

This is no longer relevant now that we block in api/replica.go until we have an elected primary cluster manager.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants