atomic.AddUint32 in UUID generation causing global lock in HTTP handler #6658
@daviesalex this seems a bit strange to me because …
I'm not sure how …
@e-dard branch jw-compact-fix, latest 0b25541. I do see your point though; it's very odd - it seems to only be called (through TimeUUID()) in ./services/httpd/handler.go. It's possible that we have a lot of HTTP requests coming in, although I'm pretty sure we saw this during startup. I'm trying to reproduce on a test box now... not having much joy. This test box does not have the same production HTTP load coming in, although it does have a similar amount of data. It's doing what I'd expect (spending time in scanblock and runtime.MSpan_Sweep), but it is still spending 5% of its time in this atomic. One thing I notice is that even though the DB is not started, the HTTP port is open and accepting connections. Perhaps this is simply a problem with high HTTP loads? (It does effectively add a lock to all inbound HTTP requests, right?) Trying to reproduce more...
@daviesalex On that branch, does it prevent #6652?
@jwilder yes. No panic, but still missing data.
Right, so here is a theory about this issue. The node I tested on is under some load from HTTP (not much, certainly compared to our UDP load, but thousands of inserts a minute). The HTTP handler starts listening immediately, so it runs during startup. The percentage was particularly high during startup because the rest of Influx does a relatively poor job of parallelizing startup, so the share of CPU time achievable by an HTTP handler that spins up a goroutine for each request is high. I'm pretty sure the issue we saw is unrelated to startup time; it simply happens whenever we have significant inbound HTTP load. This basically serializes HTTP requests that could otherwise be handled in parallel (we have 80 logical cores, so this is more obvious for us than perhaps for others). I think this is a legitimate thing to improve, but I don't know enough about the consequences of my suggested changes (particularly #2, which is trivial to code). For us, we don't really use the HTTP handler much and losing <1 core to this isn't a huge deal, but it's silly. For other users, this may be more of a problem. I renamed the issue to reflect what I (now) think the issue is, but I could still be wrong! Thoughts?
Closing this since it was related to 0.13/1.0. If it's still an issue in 1.2, let us know and we can re-open.
Running a startup on HEAD with a large dataset shows clearly (in perf) that the overwhelming majority of time is spent in atomic.AddUint32. This only appears because of performance improvements in HEAD; this does not appear in 0.13 stable (which spends most of its time in runtime.mapiternext and runtime.indexbytebody). This was introduced in "Replace code.google.com/p/go-uuid with TimeUUID from gocql" (62434fb) at the end of March.
Since I have issues losing data on HEAD, I've not been able to check whether this is also a significant user of CPU once influxd has started, but it seems plausible that it will be, particularly under high load. This code is, I think, in ./uuid/uuid.go and effectively introduces a global lock around anything that wants a UUID:
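(Roughly, and from memory: this is the gocql-derived version-1 generator, reconstructed here for illustration, so names and field offsets are approximate. The point is the single process-wide clockSeq word that every caller bumps with atomic.AddUint32, plus the node MAC copied into the last six bytes.)

```go
package uuid

import (
	"sync/atomic"
	"time"
)

type UUID [16]byte

var (
	timeBase     = time.Date(1582, time.October, 15, 0, 0, 0, 0, time.UTC).Unix()
	hardwareAddr [6]byte // node MAC, resolved once at init
	clockSeq     uint32  // shared by every UUID produced in the process
)

// TimeUUID builds a version-1 (time-based) UUID from the current time.
func TimeUUID() UUID {
	var u UUID

	utc := time.Now().In(time.UTC)
	t := uint64(utc.Unix()-timeBase)*10000000 + uint64(utc.Nanosecond()/100)
	u[0], u[1], u[2], u[3] = byte(t>>24), byte(t>>16), byte(t>>8), byte(t)
	u[4], u[5] = byte(t>>40), byte(t>>32)
	u[6], u[7] = byte(t>>56)&0x0F, byte(t>>48)

	// Every goroutine that wants a UUID contends on this one word.
	clock := atomic.AddUint32(&clockSeq, 1)
	u[8] = byte(clock >> 8)
	u[9] = byte(clock)

	copy(u[10:], hardwareAddr[:])

	u[6] |= 0x10 // version 1
	u[8] &= 0x3F // clear variant bits
	u[8] |= 0x80 // IETF variant

	return u
}
```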
This is clearly a global lock for a good reason (to ensure we don't get duplicate UUIDs within the same nanosecond), but it should be fairly simple to improve while either guaranteeing that we don't produce duplicate IDs, or making duplicates extremely unlikely.
There are various ways to improve this that occur to me. Since (AFAIK) nobody cares about nanosecond accuracy in these timestamps, one option is to have a bunch of goroutines producing these values, sharing the nanoseconds within each millisecond between them. This would still give timestamps accurate to the millisecond (or a fraction of a millisecond, whatever we wanted). A rough sketch of a related approach is below.
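Purely as a hypothetical sketch (none of these names exist in the codebase): a closely related way to cut the contention is to stripe the clock-sequence counter across several cache-line-padded slots, with each slot handing out a disjoint slice of the sequence space, so concurrent callers rarely touch the same word:

```go
package uuid

import "sync/atomic"

const numShards = 16 // e.g. a power of two near the core count

// paddedCounter keeps each shard on its own cache line to avoid false sharing.
type paddedCounter struct {
	n uint32
	_ [60]byte
}

var shards [numShards]paddedCounter

// nextClockSeq replaces the single atomic.AddUint32 on a global counter.
// The caller passes its (100ns-resolution) timestamp; its low bits pick a
// shard, so calls at different instants spread across different counters.
// Shard i only ever returns values congruent to i mod numShards, so two
// shards can never hand out the same value.
func nextClockSeq(t uint64) uint32 {
	i := uint32(t) % numShards
	n := atomic.AddUint32(&shards[i].n, 1)
	return n*numShards + i
}
```

The UUID's clock-sequence field is only 14 bits wide, so collisions within the same timestamp become less likely rather than impossible; this is only meant to show the shape of the idea.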
The second option would be to remove the node MAC (which isn't adding much) and replace it with a random number (per nanosecond), which is (from a quick read of https://www.ietf.org/rfc/rfc4122.txt) allowed. This would of course make it technically possible for a duplicate value to appear, but it is fantastically unlikely (technically, restarting the process today with a different time could also cause this; I'm not actually sure which is more likely!). See the sketch below.
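Again a hypothetical sketch only (randomNodeAndClock is an invented name, and the UUID layout matches the reconstruction above): fill the clock-sequence and node fields with random bytes instead of the counter and MAC. RFC 4122 asks that a randomly chosen node ID have its multicast bit set so it can't collide with a real MAC address:

```go
package uuid

import crand "crypto/rand"

type UUID [16]byte

// randomNodeAndClock fills bytes 8-15 (clock sequence + node) with random
// data, removing any shared state between concurrent UUID generations.
func randomNodeAndClock(u *UUID) error {
	if _, err := crand.Read(u[8:]); err != nil {
		return err
	}
	u[8] &= 0x3F  // clear variant bits
	u[8] |= 0x80  // IETF variant
	u[10] |= 0x01 // multicast bit: marks the node ID as not a real MAC
	return nil
}
```

In practice you probably wouldn't hit crypto/rand on every UUID (it has its own cost); a buffered or per-goroutine PRNG would do, but this shows the shape of the change.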
Thoughts? This is low-hanging fruit for performance optimization, I think...
Tagging @jwilder FYI; I found this while testing his PR and @mattrobenolt who I think (?) wrote this code.
Raw perf data (this remains pretty much static during the whole startup). Our dataset is a few TB all on SSD, with loads of CPU cores.