Based on the latest perf test runs against distributed_index_1140, between 55% and 60% of batch processing time is spent updating channel clocks and the stable sequence clock.
The CAS retry rate is also high for clock updates: of 8195 attempted updates to the stable sequence clock, 6991 required retries (roughly 85%).
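For context, the retry cost comes from the optimistic CAS update pattern: read the document, modify it, and write it back conditioned on the CAS value observed at read time; any concurrent writer invalidates the CAS and forces another loop iteration. A minimal sketch of that loop, using a hypothetical in-memory `mockKV` stand-in rather than the real Couchbase SDK:

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// mockKV is a hypothetical in-memory stand-in for a KV bucket, used only
// to illustrate the CAS pattern; the real code goes through the SDK.
type mockKV struct {
	mu  sync.Mutex
	val uint64
	cas uint64
}

func (kv *mockKV) get() (val, cas uint64) {
	kv.mu.Lock()
	defer kv.mu.Unlock()
	return kv.val, kv.cas
}

// setIfCas writes val only if the caller's cas matches the stored cas,
// mirroring a KV compare-and-swap write.
func (kv *mockKV) setIfCas(val, cas uint64) bool {
	kv.mu.Lock()
	defer kv.mu.Unlock()
	if cas != kv.cas {
		return false
	}
	kv.val = val
	kv.cas++
	return true
}

// updateWithRetry is the optimistic CAS loop: read, modify, attempt the
// conditional write, retry on mismatch. Under heavy write contention most
// attempts fail and loop again, which is where the retry cost comes from.
func updateWithRetry(kv *mockKV, retries *uint64) {
	for {
		val, cas := kv.get()
		if kv.setIfCas(val+1, cas) {
			return
		}
		atomic.AddUint64(retries, 1)
	}
}

func main() {
	kv := &mockKV{}
	var retries uint64
	var wg sync.WaitGroup
	for i := 0; i < 8; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for j := 0; j < 100; j++ {
				updateWithRetry(kv, &retries)
			}
		}()
	}
	wg.Wait()
	val, _ := kv.get()
	fmt.Println(val, retries)
}
```

With eight concurrent writers hammering a single document, the retry counter grows quickly; with one writer per document it stays at zero, which is the effect sharding is after.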
Sharding the channel clocks will improve write throughput, at some cost on the read side to rebuild the full clocks. Given the current write contention (which is measured with only three SG nodes), we should see an overall benefit once the clocks are sharded.
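The sharding idea can be sketched roughly as follows: hash each vbucket to one of N clock shards so concurrent writers mostly CAS against different KV docs, and rebuild the full clock on the read side by merging the shards. The shard count, key format, and function names here are all assumptions for illustration, not the actual design:

```go
package main

import (
	"fmt"
	"hash/crc32"
)

// numShards is a hypothetical shard count; the real value would be tuned
// against the number of concurrent SG writer nodes.
const numShards = 16

// shardForVb maps a vbucket number to a clock shard, so writers updating
// different vbuckets mostly land on different KV documents.
func shardForVb(vbNo uint16) int {
	b := []byte{byte(vbNo >> 8), byte(vbNo)}
	return int(crc32.ChecksumIEEE(b)) % numShards
}

// clockShardKey builds a (hypothetical) KV key for one shard of a clock.
func clockShardKey(clockName string, shard int) string {
	return fmt.Sprintf("_sync:%s:clock:%d", clockName, shard)
}

// mergeShards is the read-side cost: rebuilding the full clock by combining
// the per-shard partial clocks. Each shard holds sequences for a disjoint
// set of vbuckets, so the merge is a simple union.
func mergeShards(shards []map[uint16]uint64) map[uint16]uint64 {
	merged := make(map[uint16]uint64)
	for _, shard := range shards {
		for vb, seq := range shard {
			merged[vb] = seq
		}
	}
	return merged
}

func main() {
	shards := make([]map[uint16]uint64, numShards)
	for i := range shards {
		shards[i] = make(map[uint16]uint64)
	}
	// A writer touches only the shard owning the vbucket it updated.
	for _, vb := range []uint16{12, 500, 1023} {
		shards[shardForVb(vb)][vb]++
	}
	fmt.Println(len(mergeShards(shards))) // 3 vbuckets recovered on read
}
```

The write path now pays one CAS per touched shard instead of one CAS on a single hot document, which trades a slightly higher uncontended cost for far fewer retries under contention.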
Completed for stable clock on feature/distributed_index_1243.
With the change, P95 TimeToSubscriberInteractive for a 20-minute 1K/1K run is reduced from ~19s to ~11s.
Based on expvars, the average time taken to update the stable clock doesn't seem to have been reduced significantly, but the variance has been brought down, as we avoid the intermittent long CAS retry loops. This seems reasonable - we're now updating multiple KV entries for the clock (instead of one), so the average time without contention is going to be higher. With high write contention, though, updating multiple docs becomes more efficient.
We should see even more benefit as we scale out the number of SG writer nodes: each node updates fewer clock shards, so we won't be impacted by the increased writer contention.
Putting up a PR for this change, and will then file a new ticket to evaluate applying the same sharding to the channel clocks.