-
Notifications
You must be signed in to change notification settings - Fork 2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Nomad fails to deregister services with complex tags #3620
Comments
I've narrowed the issue down to any tag containing Consul prior to v1.0.0 has the same behavior but does not error on GET deregistration requests, so Nomad would just try to deregister an invalid service (due to the missing URL encoding is definitely the answer. Working on a fix. |
tl;dr - We need to path escape service IDs, but I'm not sure we can do it in a backward compatible way. Did some more digging and discovered the root cause: Go's You can download and I was all ready to file a bug with Go proper to stop normalizing
If I'm parsing the spec correctly it's totally appropriate to elide empty path segments (the empty string between two slashes), and therefore the second slash would get dropped just as Go is doing. Back to the escaping option! |
If the goal of Nomad passing these tags through to Consul is to make them available in DNS queries, maybe Nomad should enforce RFC-1035? Consul accepts tags that don't confirm to the spec but emits a warning that DNS functionality won't work. |
@preetapan Tags are used to configure Fabio routing rules, so we can't really introduce any restrictions on their format. Nomad does control the Service ID format. The problem is that we tack a concatenated list of tags onto the ID verbatim to ensure ID uniqueness. Even with proper escaping you can easily end up with unreasonably large Service IDs if you're using Fabio, so we should probably consider changing our Service ID format to use a hash of tags. |
I like the idea of using a hash of the tags in the service ID instead of tacking them on verbatim. Services could have a large number and variety of tags for as-yet unknown reasons. Using a hash future-proofs it against the unknown. |
@ctlajoie We determined that having two slashes, even with URL encoding causes a redirect due to the default behavior of ServerMux like @schmichael mentions above, combined with the fact that the request path in the url is already decoded when it sees This means that besides the fix in PR#3632 to generate a hashed id to prevent this in the future, you will also have to shut down the agent and manually delete the service file that contains the tags mentioned above. You can find the service file in a subdirectory named We are sorry for the inconvenience, let us know if you were able to manually delete the file. |
@preetapan Thanks! No worries about the inconvenience. |
Fixes #3620 Previously we concatenated tags into task service IDs. This could break deregistration of tag names that contained double //s like some Fabio tags. This change breaks service ID backward compatibility so on upgrade all users services and checks will be removed and re-added with new IDs. This change has the side effect of including all service fields in the ID's hash, so we no longer have to track PortLabel and AddressMode changes independently.
Fixes #3620 Previously we concatenated tags into task service IDs. This could break deregistration of tag names that contained double //s like some Fabio tags. This change breaks service ID backward compatibility so on upgrade all users services and checks will be removed and re-added with new IDs. This change has the side effect of including all service fields in the ID's hash, so we no longer have to track PortLabel and AddressMode changes independently.
@preetapan Just out of curiosity: nomad is interpreting the tags? |
@magiconair nomad only uses the tags to generate a unique service ID. For example the service ID for my example above would be (before the fix) After the fix, service IDs look more like this: |
@ctlajoie is correct (although in 0.7.1-rc1 we lowercased the service IDs. Just a UX adjustment. No functional change) |
I'm going to lock this issue because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active issues. |
Nomad version
Nomad v0.7.0
Consul v1.0.1 (likely relevant)
Operating system and Environment details
CentOS 7.4
Issue
Nomad is unable to deregister services that have complex tags, such as those used by fabio.
Nomad actually doesn't produce any relevant log messages for some reason, but consul does, and I included that below.
Reproduction steps
Run the job below, and then try to stop it. Nomad fails to deregister the service. This apparently blocks all other service registrations as well.
Job file (if appropriate)
Nomad Client logs (if appropriate)
What I find odd about the log line above is that it is a GET request, not a PUT which is required by the
/agent/service/deregister
endpoint. Also I just now noticed that the double-slash afterhttps:
became a single slash. Maybe the service ID needs to be URL encoded?The text was updated successfully, but these errors were encountered: