-
Notifications
You must be signed in to change notification settings - Fork 25k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Simplify MultiSnapshot#SeqNoset #27547
Conversation
Today, we maintain two sets in a SeqNoSet: ongoing sets and completed sets. We can remove the completed sets by releasing the internal bitset of a CountedBitSet when all its bits are set. Relates elastic#27268
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The basics look good. Left one comment suggestion a slight variant on the approach
@@ -99,60 +100,62 @@ public void close() throws IOException { | |||
|
|||
boolean getAndSet(int index) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we need the getAndSet semantics any more. We can just have set and if people want to pre-get, they can. This way this can inherit from BitSet and be used as a drop-in replacement for it (although we can implement the clear() other methods, I think it's fine to throw an UnsupportedOperationException until we need them)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This way this can inherit from BitSet and be used as a drop-in replacement for it.
I agree. I move this class to the common package and implement methods from BitSet.
I don't think we need the getAndSet semantics any more. We can just have set and if people want to pre-get, they can.
The method getAndSet
is optimized to avoid looking an entry twice, but I am fine to remove it.
As far as i can tell all current usages are not performance sensitive so I think we should go with simplicity.
… On 30 Nov 2017, at 10:27 PM, Nhat Nguyen ***@***.***> wrote:
@dnhatn commented on this pull request.
In core/src/main/java/org/elasticsearch/index/translog/MultiSnapshot.java:
> @@ -99,60 +100,62 @@ public void close() throws IOException {
boolean getAndSet(int index) {
This way this can inherit from BitSet and be used as a drop-in replacement for it.
I agree. I move this class to the common package and implement methods from BitSet.
I don't think we need the getAndSet semantics any more. We can just have set and if people want to pre-get, they can.
The method getAndSet is optimized to avoid looking an entry twice, but I am fine to remove it.
—
You are receiving this because your review was requested.
Reply to this email directly, view it on GitHub, or mute the thread.
|
@bleskes, I have addressed your suggestion. Could you please take another look? Thank you. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I left a question.
* when all bits are set to reduce memory usage. This structure can work well for sequence numbers | ||
* from translog as these numbers are likely to form contiguous ranges (eg. filling all bits). | ||
*/ | ||
public final class CountedBitSet extends BitSet { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This only has one consumer, does it really need to be in a common package? What's the motivation for moving this out of the translog package?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Boaz suggested to make it as a drop-in replacement for BitSet
. I thought about the LocalCheckpointTracker
and moved it to the common package. I am happy to move it back to the translog package.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
First, the description of the PR is "Move CountedBitSet to the common package" yet it appears the changes are larger than that; it's best keep the summary and the changes aligned. Also, I think moving and changing the implementation should be separate PRs and I think the reason for the move should be explained in the commit message.
With that out of the way: if the intention is to reuse this in the local checkpoint tracker I still do not think these needs to be in common, it can be in the sequence number package org.elasticseach.index.seqno
since the only uses relate to sequence numbers. It's a shame this class has to be public in that package even.
My resistance is that the common package is already a bit of a dumping ground and at least in the sequence number package it's slightly clearer than it's an internal implementation rather than for public consumption (e.g., in plugins). When we modularize (JDK modules) we will be in a better position to make clear what is part of the public API and what is not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
First, the description of the PR is "Move CountedBitSet to the common package" yet it appears the changes are larger than that; it's best keep the summary and the changes aligned.
My bad, I should have updated the description more carefully when addressing feedbacks.
Also, I think moving and changing the implementation should be separate PRs
I agree. I will move this class back to the translog
package in this PR. If we agree to use it in the LocalCheckpointTracker
, we can make another PR to move it the seqno
package later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I pushed 0df0474
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left a minor suggestion. LGTM.
|
||
@Override | ||
public int length() { | ||
return bitset == null ? onBits : bitset.length(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
smart
} | ||
final int index = Math.toIntExact(value % BIT_SET_SIZE); | ||
final boolean wasOn = bitset.get(index); | ||
bitset.set(index); | ||
return wasOn; | ||
} | ||
|
||
// For testing | ||
long completeSetsSize() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now that we abstract the underlying storage away, I don't think we care about whether underlying sets are completed or not. All we care about is functionality. I think we can remove completedSetSize and ongoingSetSize and their tests?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I pushed 14aca62
final CountedBitSet countedBitSet = new CountedBitSet((short) numBits); | ||
|
||
final int iterations = iterations(1000, 20000); | ||
for (int i = 0; i < iterations; i++) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is a great tests but I think we can simplify it by randomly setting keys and check at the end that the result is the same for all position (it will also show nothing change for keys after they were set).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I pushed 0e6a31e
retest this please. |
Test failure is unrelated. The last commits look good to me. Let's give @jasontedor some time to respond whether the expected usage in LocalCheckpointTracker is justifying putting this in common. |
Thanks @bleskes, I responded inline where the discussion is taking place. |
Thanks @bleskes and @jasontedor for the reviews. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I left a few observations.
private FixedBitSet bitset; | ||
|
||
public CountedBitSet(short numBits) { | ||
assert numBits > 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this should be a hard illegal argument exception.
|
||
@Override | ||
public long ramBytesUsed() { | ||
throw new UnsupportedOperationException("Not implemented yet"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we need an implementation for this method however I think this could be RamUsageEstimator.shallowSizeOfInstance(CountedBitSet.class) + (bitset == null ? 0 : bitset.ramBytesUsed());
. You could even fold RamUsageEstimator.shallowSizeOfInstance(CountedBitSet.class)
into a static final constant.
|
||
@Override | ||
public void clear(int startIndex, int endIndex) { | ||
throw new UnsupportedOperationException("Not implemented yet"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think "Not implemented yet" adds anything other the exception type (and could be misleading if we never intend to implement).
throw new UnsupportedOperationException("Not implemented yet"); | ||
} | ||
|
||
// Exposed for testing |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I personally think these comments do not add anything over the fact that we can use the IDE to see where a method is used. If it used in tests and only tests then we already know it is exposed for tests.
private FixedBitSet bitset; | ||
|
||
CountedBitSet(short numBits) { | ||
assert numBits > 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be a hard illegal argument exception?
public boolean get(int index) { | ||
assert 0 <= index && index < this.length(); | ||
assert bitset == null || onBits < bitset.length() : "Bitset should be released when all bits are set"; | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This empty line can go.
|
||
// Ignore set when bitset is full. | ||
if (bitset != null) { | ||
boolean wasOn = bitset.getAndSet(index); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can (and should) be final.
Sorry @jasontedor, I missed your comments. |
This commit addresses the missed comments from elastic#27547.
Today, we maintain two sets in a SeqNoSet: ongoing sets and completed sets. We can remove the completed sets and use only the ongoing sets by releasing the internal bitset of a CountedBitSet when all its bits are set. This behaves like two sets but simpler. This commit also makes CountedBitSet as a drop-in replacement for BitSet. Relates #27268
This commit addresses the missed comments from #27547.
This commit addresses the missed comments from #27547.
Today, we maintain two sets in a
SeqNoSet
: ongoing sets and completed sets. We can remove the completed sets and use only the ongoing sets by releasing the internal bitset of aCountedBitSet
when all its bits are set. This behaves like two sets but simpler. This commit also makesCountedBitSet
as a drop-in replacement forBitSet
.Relates #27268