-
Notifications
You must be signed in to change notification settings - Fork 14
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
APP-1086: Fix typing, use BiConsumer subclass for iteration #267
base: csd-2.3
Are you sure you want to change the base?
Conversation
Ok, I ran the tests against LevelDB. The original code which uses the following code in the inner loop of stage deletion: Pulling all records for each stage and post-filtering takes an enormous time -- about 5s. We need to pull the deletion code out of the stage loop if we want performance for both LevelDB and InMemory stores. Tracking deleted stages and then deleting all executorStages using the countingRemoveIf is slightly faster than the original code at about 95ms. scaling for the original and final replacement code is essentially linear: For the final repalcement code (countingRemoveIf), the InMemory solution is roughly 100 times faster than the LevelDB timing -- 120ms at the 100x size. |
I was considering this code to be more part of the for-Apache polishing than the for-CSD bugfix, so I was putting off reviewing and merging. From your new test results, I'm concluding that there is no reason not to create a for-Apache PR from essentially what we have after a final review and any last polishing. The LevelDB branch would still benefit from some changes at that point, but we don't have to tackle those immediately beyond noting in the JIRA the advisability of further work. |
Rewrote removeAll to be faster than countingRemoveIf even when using the original indices (yay!). |
BTW, I assume we don't want to check-in PerfSuite, which is why I'm keeping those commits separated. |
private ConcurrentMap<Class<?>, InstanceList<?>> data = new ConcurrentHashMap<>(); | ||
|
||
@SuppressWarnings("unchecked") | ||
public <T> InstanceList<T> get(Class<T> type) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I realize that there is a lot of questionable precedent exerting influence, but what value is there in get(type)
vs. a more expected get(key)
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm trying to follow precedent. 🤷
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Setting to key.
try { | ||
return new InstanceList<>(key); | ||
} catch (Exception e) { | ||
throw Throwables.propagate(e); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
4 of 8 uses of get() call Throwables.propagate (counting this new one).
I think the best thing to do here is to change get to throw ReflectiveOperationException, and then have the code that currently calls propagate, call new RuntimeException(e) for ReflectiveOperationException.
Iterator<Map.Entry<Comparable<Object>, Object>> each = data.entrySet().iterator(); | ||
int count = 0; | ||
@SuppressWarnings("unchecked") | ||
int countingRemoveIfByKey(String index, Collection keys) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
countingRemoveIfByKeys
to match removeAllByKeys
@SuppressWarnings("unchecked") | ||
int countingRemoveIfByKey(String index, Collection keys) { | ||
KVTypeInfo.Accessor getter = ti.getAccessor(index); | ||
Predicate<? super T> filter = getPredicate(getter, keys); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
consider not binding getter
and just doing getPredicate(ti.getAccessor(index), keys)
try { | ||
return getter.get(value); | ||
} catch (Exception e) { | ||
throw Throwables.propagate(e); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cf. prior deprecation comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah -- I copied this code from elsewhere in InMemoryStore because getter.get is, afaict, everywhere used where exceptions can't be rethrown. Or something similarly as stupid :| I was not amused.
get() is defined as throwing Exception, so I can't use one of the simpler alternatives, and if I touch this code, I'm going to stop allowing exceptions being thrown out, and force the implementation to do this propagation where necessary.
@@ -79,55 +74,49 @@ public long count(Class<?> type, String index, Object indexedValue) throws Excep | |||
|
|||
@Override | |||
public <T> T read(Class<T> klass, Object naturalKey) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is Class<T> klass
vs. Class<?> type
an intentional choice or an accidental consistency? Either way, I'm not sure that I am following the logic of the convention.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmmm... also not entirely consistent across related files.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did notice this, but I didn't follow up and look. Thinking about it before looking, I assume Class type doesn't work in scala, while Class[T] klass would. Let me see what's with that, though....
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was thinking more about naming convention, klass
vs. type
-- more about understandability than functionality.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's 'type' because KVTypeInfo.
But KVStore seems to be mixed, and ElementTrackingStore uses klass always.
What changes were proposed in this pull request?
Move type-safety into compile-time. Remove runtime type checking.
Use a BiConsumer to consume elements in the map and count removals.
How was this patch tested?
(Please explain how this patch was tested. E.g. unit tests, integration tests, manual tests)
(If this patch involves UI changes, please attach a screenshot; otherwise, remove this)
Please review http://spark.apache.org/contributing.html before opening a pull request.