M3DB client should not reject multiple blocks from a peer with the same blockstart #1640
@robskillington This is the issue we were talking about. Let me know if this makes sense; it seems like an easy fix if we're right.
Correct, yeah, this defensive code did not take into account that we would have multiple buffers. The easy fix would be to let it actually merge, yes.
I was looking through some of the server-side code because I was working on the repair stuff, and now that I'm looking at that, I'm wondering if maybe we need to make a change there. The code is `fetchBlocksWithBlocksMapAndBuffer`:

```go
func (r Reader) fetchBlocksWithBlocksMapAndBuffer(
    ctx context.Context,
    starts []time.Time,
    seriesBlocks block.DatabaseSeriesBlocks,
    seriesBuffer databaseBuffer,
    nsCtx namespace.Context,
) ([]block.FetchBlockResult, error) {
    var (
        // TODO(r): pool these results arrays
        res         = make([]block.FetchBlockResult, 0, len(starts))
        cachePolicy = r.opts.CachePolicy()
        // NB(r): Always use nil for OnRetrieveBlock so we don't cache the
        // series after fetching it from disk, the fetch blocks API is called
        // during streaming so to cache it in memory would mean we would
        // eventually cache all series in memory when we stream results to a
        // peer.
        onRetrieve block.OnRetrieveBlock
    )
    for _, start := range starts {
        if seriesBlocks != nil {
            if b, exists := seriesBlocks.BlockAt(start); exists {
                streamedBlock, err := b.Stream(ctx)
                if err != nil {
                    r := block.NewFetchBlockResult(start, nil,
                        fmt.Errorf("unable to retrieve block stream for series %s time %v: %v",
                            r.id.String(), start, err))
                    res = append(res, r)
                }
                if streamedBlock.IsNotEmpty() {
                    b := []xio.BlockReader{streamedBlock}
                    r := block.NewFetchBlockResult(start, b, nil)
                    res = append(res, r)
                }
                continue
            }
        }
        switch {
        case cachePolicy == CacheAll:
            // No-op, block metadata should have been in-memory
        case r.retriever != nil:
            // Try to stream from disk
            if r.retriever.IsBlockRetrievable(start) {
                streamedBlock, err := r.retriever.Stream(ctx, r.id, start, onRetrieve, nsCtx)
                if err != nil {
                    r := block.NewFetchBlockResult(start, nil,
                        fmt.Errorf("unable to retrieve block stream for series %s time %v: %v",
                            r.id.String(), start, err))
                    res = append(res, r)
                }
                if streamedBlock.IsNotEmpty() {
                    b := []xio.BlockReader{streamedBlock}
                    r := block.NewFetchBlockResult(start, b, nil)
                    res = append(res, r)
                }
            }
        }
    }
    if seriesBuffer != nil && !seriesBuffer.IsEmpty() {
        bufferResults := seriesBuffer.FetchBlocks(ctx, starts, nsCtx)
        res = append(res, bufferResults...)
    }
    block.SortFetchBlockResultByTimeAscending(res)
    return res, nil
}
```

Now that I've been reading through some of this code path on the client and server side more carefully, I wonder if the issue is actually that only one block should be returned, but that block should have multiple streams in it. For example, if the code linked above finds data both in the buffer and on disk (which can happen right now after a flush but before the eviction triggered by the tick, and will happen more often when the out-of-order writes stuff lands), we'll return two separate results for the same blockStart. Can one of you look through that and tell me if I'm right, and if not, explain to me why? I just want to make sure I understand this correctly.
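As a rough illustration of that alternative (one result per blockStart carrying multiple streams), here is a minimal, self-contained Go sketch. The `blockResult` and `stream` types below are hypothetical stand-ins, not the actual `block.FetchBlockResult` and `xio.BlockReader` types, and `mergeByBlockStart` is not M3DB code; it only shows the merging shape being discussed.

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

// stream and blockResult are stand-ins for xio.BlockReader and
// block.FetchBlockResult; the real M3DB types differ.
type stream struct{ source string }

type blockResult struct {
	start   time.Time
	streams []stream
}

// mergeByBlockStart collapses results that share a blockStart into a single
// result carrying all of their streams, instead of returning one result per
// source (disk, in-memory block, buffer).
func mergeByBlockStart(results []blockResult) []blockResult {
	byStart := make(map[time.Time][]stream)
	for _, r := range results {
		byStart[r.start] = append(byStart[r.start], r.streams...)
	}
	merged := make([]blockResult, 0, len(byStart))
	for start, streams := range byStart {
		merged = append(merged, blockResult{start: start, streams: streams})
	}
	sort.Slice(merged, func(i, j int) bool { return merged[i].start.Before(merged[j].start) })
	return merged
}

func main() {
	start := time.Date(2019, 5, 20, 0, 0, 0, 0, time.UTC)
	// Two results for the same blockStart: one from disk, one from the buffer,
	// as can happen after a flush but before the tick evicts the block.
	results := []blockResult{
		{start: start, streams: []stream{{source: "disk"}}},
		{start: start, streams: []stream{{source: "buffer"}}},
	}
	for _, r := range mergeByBlockStart(results) {
		sources := make([]string, 0, len(r.streams))
		for _, s := range r.streams {
			sources = append(sources, s.source)
		}
		fmt.Println(r.start.Format(time.RFC3339), sources) // one result, two streams
	}
}
```

In the real function, the equivalent change would presumably be to fold the buffer's streams into the result already built for the same start, rather than appending a second result with a duplicate blockStart.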
Right now the `streamBlocksBatchFromPeer` call will reject responses from peers that return more than one block for a given blockStart. I assume this exists for historical reasons, but it makes it so that peer bootstrapping sometimes gets stuck trying to make progress until its peers stop returning multiple blocks (which can happen after a flush, when the blocks have not yet been evicted from memory by the tick, so they exist in the buffer and on disk, or in the near future when we have out-of-order writes). It seems to me like we can just delete the following code from `streamBlocksBatchFromPeer` in `session.go`, and then the merging logic in `addBlockFromPeer` should just work itself out:
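The snippet referenced above is not reproduced here. As a hedged, hypothetical illustration of the kind of defensive check being described (not the actual `session.go` code), the sketch below fails a peer response as soon as two blocks share a blockStart, which is the behaviour that stalls bootstrapping when a peer holds the same block both in its buffer and on disk.

```go
package main

import (
	"fmt"
	"time"
)

// peerBlock is a stand-in for the per-block metadata in a peer's fetch
// response; the real client types in session.go differ.
type peerBlock struct {
	start time.Time
}

// validateNoDuplicateStarts mimics the defensive check described above: it
// rejects the whole response when two blocks share a blockStart, instead of
// letting duplicate starts fall through to the merge path.
func validateNoDuplicateStarts(blocks []peerBlock) error {
	seen := make(map[time.Time]struct{}, len(blocks))
	for _, b := range blocks {
		if _, ok := seen[b.start]; ok {
			return fmt.Errorf("multiple blocks returned for blockStart %v", b.start)
		}
		seen[b.start] = struct{}{}
	}
	return nil
}

func main() {
	start := time.Date(2019, 5, 20, 0, 0, 0, 0, time.UTC)
	// A flushed-but-not-yet-evicted block plus a buffered block for the same
	// start would be rejected outright by a check of this shape.
	blocks := []peerBlock{{start: start}, {start: start}}
	fmt.Println(validateNoDuplicateStarts(blocks))
}
```

Removing a check of this shape would let duplicate blockStarts reach `addBlockFromPeer`, whose merging logic is expected to combine them.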