[WIP] performance improvements for high cardinality #7

e-dard · 2017-02-08T15:02:34Z

influx-stress currently pre-allocates all series and points when starting up, which makes running high cardinality loads nigh on impossible locally.

This PR helps somewhat with this problem by generating batches of series and points only as and when they are needed.

Series are generated in batches of 100,000, while line-protocol points are generated in batches of 500,000.

We may want to tweak these numbers; I haven't tested. Previously, however, writing 100M series ate up over 18GB of RAM, while doing it this way seems to consume a few hundred.

I'm not yet sure whether it has significantly impacted the maximum throughput, however.
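To illustrate the general idea of generating series in batches rather than up front, here is a minimal sketch. It is not the code in this PR: the `seriesBatcher` type, its fields, and the `cpu,host=server-N` key format are all hypothetical, and only the batch size of 100,000 comes from the description above.

```go
package main

import "fmt"

// seriesBatchSize matches the series batch size quoted in the description.
const seriesBatchSize = 100000

// seriesBatcher is a hypothetical helper that yields series keys in
// fixed-size batches instead of materialising all of them at startup.
type seriesBatcher struct {
	total int // total number of series to generate
	next  int // index of the next series to generate
	size  int // maximum number of series per batch
}

// Next returns the next batch of series keys, or nil once all series
// have been produced. Only one batch is held in memory at a time.
func (b *seriesBatcher) Next() []string {
	if b.next >= b.total {
		return nil
	}
	n := b.size
	if remaining := b.total - b.next; remaining < n {
		n = remaining
	}
	batch := make([]string, n)
	for i := range batch {
		// Illustrative key format only; real keys come from the template.
		batch[i] = fmt.Sprintf("cpu,host=server-%d", b.next+i)
	}
	b.next += n
	return batch
}

func main() {
	b := &seriesBatcher{total: 250000, size: seriesBatchSize}
	for batch := b.Next(); batch != nil; batch = b.Next() {
		fmt.Println(len(batch)) // batch sizes: 100000, 100000, 50000
	}
}
```

With this shape, peak memory is bounded by the batch size rather than by the total cardinality, which is the trade-off the description is pointing at.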

desa · 2017-02-08T16:11:32Z

point/point.go

+
+	var numTags int
+	s.template, numTags = formatTemplate(tmplt)
+	s.tagCardinalities = tagCardinalityPartition(numTags, primeFactorization(cardinality))


Is this call not problematic with 100M series?
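For a sense of scale on the factorization part of that call: assuming `primeFactorization` is a plain trial-division routine (an assumption; its body is not shown in this excerpt), the work is bounded by the square root of the input, so roughly 10,000 loop iterations at most for values near 100M. The `primeFactors` function below is a hypothetical stand-in, not the PR's code, and it says nothing about the cost of `tagCardinalityPartition`.

```go
package main

import "fmt"

// primeFactors returns the prime factors of n in ascending order,
// with multiplicity, using trial division. Hypothetical stand-in for
// the primeFactorization function referenced in the diff.
func primeFactors(n int) []int {
	var factors []int
	for p := 2; p*p <= n; p++ {
		// Divide out p as many times as it occurs.
		for n%p == 0 {
			factors = append(factors, p)
			n /= p
		}
	}
	// Whatever remains above 1 is itself prime.
	if n > 1 {
		factors = append(factors, n)
	}
	return factors
}

func main() {
	// 100,000,000 = 2^8 * 5^8
	fmt.Println(primeFactors(100000000))
}
```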

Don't pre-allocate all series/points

d218a29
