-
Notifications
You must be signed in to change notification settings - Fork 455
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement (opt-in) WriteBatchRawV2 that can batch across namespaces #1974
Conversation
Codecov Report
@@ Coverage Diff @@
## master #1974 +/- ##
=========================================
+ Coverage 63.4% 63.4% +<.1%
=========================================
Files 1119 1119
Lines 105570 105968 +398
=========================================
+ Hits 66969 67253 +284
- Misses 34315 34402 +87
- Partials 4286 4313 +27
Continue to review full report at Codecov.
|
f2b5d09
to
65bf84a
Compare
|
||
// UseV2BatchAPIs determines whether the V2 batch APIs are used. Note that the M3DB nodes must | ||
// have support for the V2 APIs in order for this feature to be used. | ||
UseV2BatchAPIs *bool `yaml:"useV2BatchAPIs"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thinking about this in the long term, is this sustainable? The config might be littered with different versions for different sets of APIs and it could get very messy. How about just a single APIVersion *string
where users will put in "0.2.3"
or something similar?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The plan is to deprecate the old APIs soon so I'm not super worried about it but I can make it a string. I'll probably keep it a bool in the guts of the codebase though just to keep things simple
struct WriteBatchRawV2RequestElement { | ||
1: required binary id | ||
2: required Datapoint datapoint | ||
3: required i64 nameSpace |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand that this might save a ton of memory, but might be awkward to use from a user perspective. I suppose there can be a thin wrapper around this interface to make this easier.
Why not go all out and have list<binary> ids
at the request level and required i64 id
here instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No one will interact with this directly, it all is handled transparently by the client so I think thats fine.
I've spoken with Rob before about the IDs thing and its pretty uncommon to have multiple data points for the same ID in one request so I'm gonna leave that out for now for simplicity
struct WriteTaggedBatchRawRequestElement { | ||
1: required binary id | ||
2: required binary encodedTags | ||
3: required Datapoint datapoint | ||
} | ||
|
||
struct WriteTaggedBatchRawV2RequestElement { | ||
1: required binary id |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comment regarding list of ids at the request level here too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same response
writeOpBatchSize tally.Histogram | ||
fetchOpBatchSize tally.Histogram | ||
status status | ||
serverSupportsV2APIs bool |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar comment to above regarding versioning. Imagining a serverSupportsV3APIs
and serverSupportsV4APIs
tag here later on is quite painful.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think for this portion I'll leave as is as a bool because with an enum you need to handle the case where its an invalid value. Hopefully we can just delete this code soon
} | ||
|
||
seriesID := s.newPooledID(ctx, elem.ID, pooledReq) | ||
batchWriter.Add( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can batchWriter
be nil at this stage?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No I don't think so. It starts off as nil and it will get set in the first iteration of the loop. Everytime after that where it gets set to nil (caus a batch was written) we assign a new one
} | ||
|
||
seriesID := s.newPooledID(ctx, elem.ID, pooledReq) | ||
batchWriter.AddTagged( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can batchWriter
be nil at this stage?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same comment
ae3c47d
to
04bc245
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
For workloads that involve multiple namespaces the existing implementation can lead to a lot of extraneous RPC which leads to increased load / instability of the M3DB nodes. This P.R adds the ability to opt-in on the client (when working with versions of M3DB that support the new APIs) to use a new API that batches writes across namespaces transparently leading to improved performance.