Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

USR1 signal still spawns additional loops on plugins that use refresh #3815

Closed
mattlqx opened this issue Apr 6, 2020 · 1 comment · Fixed by #4096
Closed

USR1 signal still spawns additional loops on plugins that use refresh #3815

mattlqx opened this issue Apr 6, 2020 · 1 comment · Fixed by #4096

Comments

@mattlqx
Copy link
Contributor

mattlqx commented Apr 6, 2020

What happened:

Using the route53 plugin (others like azure and clouddns probably as well), can spawn an additional concurrent refresh loop for every USR1 signal received.

I added a debug line to the route53 updateZones function:

@@ -238,6 +242,7 @@ func updateZoneFromRRS(rrs *route53.ResourceRecordSet, z *file.Zone) error {
 func (h *Route53) updateZones(ctx context.Context) error {
 	errc := make(chan error)
 	defer close(errc)
+	fmt.Println(time.Now().String(), "running refresh loop: ", h.zones)
 	for zName, z := range h.zones {
 		go func(zName string, z []*zone) {
 			var err error

My debugging output is recorded here w/ 3 server blocks configured with the route53 plugin and configured for a 5 minute refresh interval. https://gist.github.com/mattlqx/7e8126c42a02cf6caa0982adb8fb8207

A loop is started when the server is started. I let it it refresh once normal, which happens at 5 minutes, then shortly there after I sent a USR1 signal. Another refresh occurs, and then approximately 4.5 minutes and 5 minutes after that you can see both loops refresh again.

What you expected to happen:

USR1 should only refresh configuration information and clean-up any existing loops to get external data that a plugin may have open.

How to reproduce it (as minimally and precisely as possible):

Configure a route53 plugin to get zone from Route 53.
Add a debugging print statement.
Run CoreDNS and observe the logging statements.
Send USR1 as many times as you like to observe n loops occurring in the background.

Anything else we need to know?:

There was an underlying problem with Caddy a while ago that seemed to be related to the issue. #2396 But CoreDNS now uses the fixed version of Caddy and the problem still exists. I'm not really sure how to debug/fix further. I do note that this log statement is never printed: https://github.com/coredns/coredns/blob/master/plugin/route53/route53.go#L90. So I would imagine the context's Done channel is never closing.

Environment:

  • the version of CoreDNS: 1.6.9
  • Corefile:
(cacheandlog) {
  prometheus
  cache 300
  errors
  log . {
    class error
  }
}

example.com {
  route53 example.com.:Z1GCMBPMI21YYY example.com.:Z3IZZERHBUPZZZ {
    credentials default /configs/credentials
    refresh 5m
    fallthrough
  }
  forward . 172.30.13.132 172.30.14.132
  import cacheandlog
}

16.172.in-addr.arpa {
  route53 16.172.in-addr.arpa.:Z33QPSXQCQT8W1 {
    credentials default /configs/credentials
    refresh 5m
    fallthrough
  }
  forward . 172.30.13.132 172.30.14.132
  import cacheandlog
}

30.172.in-addr.arpa {
  route53 30.172.in-addr.arpa.:Z1CYHZ3A0GBKDI {
    credentials default /configs/credentials
    refresh 5m
    fallthrough
  }
  forward . 172.30.13.132 172.30.14.132
  import cacheandlog
}
  • OS (e.g: cat /etc/os-release): PRETTY_NAME="Debian GNU/Linux 10 (buster)" (Golang 1.14.1 docker container)
@mattlqx mattlqx added the bug label Apr 6, 2020
@miekg
Copy link
Member

miekg commented Jul 9, 2020

this should indeed be fixed.

miekg added a commit that referenced this issue Aug 31, 2020
Stop the context so the refresh loop terminates on restart.

Fixes: #3815

Signed-off-by: Miek Gieben <[email protected]>
yongtang pushed a commit that referenced this issue Aug 31, 2020

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature. The key has expired.
Stop the context so the refresh loop terminates on restart.

Fixes: #3815

Signed-off-by: Miek Gieben <[email protected]>
nyodas pushed a commit to DataDog/coredns that referenced this issue Oct 26, 2020

Verified

This commit was signed with the committer’s verified signature.
nyodas Bob
Stop the context so the refresh loop terminates on restart.

Fixes: coredns#3815

Signed-off-by: Miek Gieben <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants