
APP-1086: Fix typing, use BiConsumer subclass for iteration #267

Open
wants to merge 16 commits into csd-2.3

Conversation

davidnavas

What changes were proposed in this pull request?

Move type-safety checks to compile time and remove runtime type checking.
Use a BiConsumer to consume elements in the map and count removals.
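
As a rough illustration of the BiConsumer idea, here is a minimal sketch (hypothetical names; CountingRemover is not a class from this PR):

```java
import java.util.Map;
import java.util.function.BiConsumer;
import java.util.function.Predicate;

// Hypothetical sketch: a BiConsumer subclass that visits (key, value) pairs during
// iteration, removes entries that match a predicate, and counts the removals.
final class CountingRemover<K, V> implements BiConsumer<K, V> {
  private final Map<K, V> data;
  private final Predicate<V> shouldRemove;
  private int removed = 0;

  CountingRemover(Map<K, V> data, Predicate<V> shouldRemove) {
    this.data = data;
    this.shouldRemove = shouldRemove;
  }

  @Override
  public void accept(K key, V value) {
    // Map.remove(key, value) only removes when the mapping is still current,
    // so the count reflects entries this consumer actually deleted.
    if (shouldRemove.test(value) && data.remove(key, value)) {
      removed++;
    }
  }

  int removed() {
    return removed;
  }
}
```

With a ConcurrentHashMap backing the data, `data.forEach(remover)` can drive the iteration (its traversal tolerates concurrent removal), and `remover.removed()` reports how many entries were deleted.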

How was this patch tested?



@davidnavas requested a review from markhamstra · April 19, 2019, 20:20
@davidnavas (Author)

Ok, I ran the tests against LevelDB.
Also, I somehow thought we had already merged this :|

The original code, which uses the following in the inner loop of stage deletion:
```scala
tracking.view(classOf[ExecutorStageSummaryWrapper])
  .index("stage")
  .first(key)
  .last(key)
  .asScala
  .foreach { e =>
    tracking.delete(e.getClass(), e.id)
  }
```
runs quite quickly when using LevelDB, as the LevelDB backend apparently does not have to sort in order to pull the restricted subset. The average time is about 116ms.

Pulling all records for each stage and post-filtering takes an enormous amount of time -- about 5s. We need to pull the deletion code out of the stage loop if we want good performance for both the LevelDB and InMemory stores.

Tracking deleted stages and then deleting all executorStages using the countingRemoveIf is slightly faster than the original code at about 95ms.
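
Roughly the shape of that "track deleted stages, then bulk-delete summaries" approach, as a hypothetical sketch against plain maps (BulkCleanup, cleanUp, and stageKeyOf are illustrative names, not the PR's API):

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;

// Hypothetical sketch: hoist the summary cleanup out of the per-stage loop by
// remembering which stage keys were deleted, then removing all matching summaries
// in one pass and reporting how many were removed.
final class BulkCleanup {
  static <K, S, V> int cleanUp(Map<K, S> stages,
                               Set<K> stagesToDelete,
                               Map<Object, V> summaries,
                               Function<V, K> stageKeyOf) {
    Set<K> deleted = new HashSet<>();
    for (K stageKey : stagesToDelete) {
      if (stages.remove(stageKey) != null) {    // the per-stage work stays cheap
        deleted.add(stageKey);
      }
    }
    int before = summaries.size();
    // One pass over all executor-stage summaries instead of one ranged scan per stage.
    summaries.values().removeIf(v -> deleted.contains(stageKeyOf.apply(v)));
    return before - summaries.size();           // the "counting" part of the removal
  }
}
```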

Scaling for the original and the final replacement code is essentially linear:

- 116ms and 95ms at 1x (as above)
- 1.05s and 890ms at 10x
- 12.4s and 9.5s at 100x

For the final replacement code (countingRemoveIf), the InMemory solution is roughly 100 times faster than the LevelDB timing -- 120ms at the 100x size.

@markhamstra

I was considering this code to be more part of the for-Apache polishing than the for-CSD bugfix, so I was putting off reviewing and merging. From your new test results, I'm concluding that there is no reason not to create a for-Apache PR from essentially what we have after a final review and any last polishing. The LevelDB branch would still benefit from some changes at that point, but we don't have to tackle those immediately beyond noting in the JIRA the advisability of further work.

@davidnavas (Author)

Rewrote removeAll to be faster than countingRemoveIf even when using the original indices (yay!).
The Long index is another 15% faster for InMemory, but the new code is already ~10% faster than last night's, and at some point the performance is "good enough": 86ms for 100k deletes vs 75ms 🤷
The countingRemoveIf was over 100ms for 100k deletes, so it wasn't competitive anymore even for the InMemory case (it was never a good fit for LevelDB).
The LevelDB implementation has too much overhead to notice the 11ms difference (0.1%) between index variants.
As a result, I removed all the older code, and I think I'd skip surfacing the countingRemoveIf variant.

@davidnavas (Author)

BTW, I assume we don't want to check in PerfSuite, which is why I'm keeping those commits separate.

private ConcurrentMap<Class<?>, InstanceList<?>> data = new ConcurrentHashMap<>();

@SuppressWarnings("unchecked")
public <T> InstanceList<T> get(Class<T> type) {


I realize that there is a lot of questionable precedent exerting influence, but what value is there in get(type) vs. a more expected get(key)?

@davidnavas (Author) · May 7, 2019


I'm trying to follow precedent. 🤷

@davidnavas (Author)

Setting to key.

try {
return new InstanceList<>(key);
} catch (Exception e) {
throw Throwables.propagate(e);


@davidnavas (Author)

4 of 8 uses of get() call Throwables.propagate (counting this new one).
I think the best thing to do here is to change get to throw ReflectiveOperationException, and then have the code that currently calls propagate wrap the ReflectiveOperationException in a new RuntimeException(e) instead.
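
A self-contained sketch of that shape (InstanceList and ListStore here are stand-ins for illustration, not the real classes): get() surfaces the checked ReflectiveOperationException, and only the call sites that truly cannot rethrow wrap it in a RuntimeException.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical sketch: surface the checked exception from the reflective work and
// wrap it only where a checked exception cannot escape, instead of calling the
// deprecated Throwables.propagate.
class ListStore {
  // Stand-in for InstanceList: the reflective work in its constructor is what throws.
  static final class InstanceList<T> {
    InstanceList(Class<T> key) throws ReflectiveOperationException {
      key.getDeclaredConstructor();  // representative reflective call
    }
  }

  private final ConcurrentMap<Class<?>, InstanceList<?>> data = new ConcurrentHashMap<>();

  @SuppressWarnings("unchecked")
  public <T> InstanceList<T> get(Class<T> key) throws ReflectiveOperationException {
    InstanceList<?> list = data.get(key);
    if (list == null) {
      InstanceList<T> created = new InstanceList<>(key);
      InstanceList<?> prior = data.putIfAbsent(key, created);
      list = (prior != null) ? prior : created;
    }
    return (InstanceList<T>) list;
  }

  // A call site that cannot let the checked exception escape wraps it exactly once.
  public <T> InstanceList<T> getOrThrow(Class<T> key) {
    try {
      return get(key);
    } catch (ReflectiveOperationException e) {
      throw new RuntimeException(e);
    }
  }
}
```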

Iterator<Map.Entry<Comparable<Object>, Object>> each = data.entrySet().iterator();
int count = 0;
@SuppressWarnings("unchecked")
int countingRemoveIfByKey(String index, Collection keys) {


countingRemoveIfByKeys to match removeAllByKeys

@SuppressWarnings("unchecked")
int countingRemoveIfByKey(String index, Collection keys) {
KVTypeInfo.Accessor getter = ti.getAccessor(index);
Predicate<? super T> filter = getPredicate(getter, keys);


consider not binding getter and just doing getPredicate(ti.getAccessor(index), keys)

try {
return getter.get(value);
} catch (Exception e) {
throw Throwables.propagate(e);


cf. prior deprecation comment

@davidnavas (Author)

Yeah -- I copied this code from elsewhere in InMemoryStore because getter.get is, afaict, used everywhere in places where exceptions can't be rethrown, or something similarly unfortunate :| I was not amused.
get() is declared as throwing Exception, so I can't use one of the simpler alternatives, and if I touch this code, I'm going to stop letting exceptions escape and force the implementation to do this propagation where necessary.
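
For reference, one way that could look, as a hypothetical sketch with stand-in types (Accessor and getPredicate below are illustrative, not the project's real signatures): the accessor declares only ReflectiveOperationException, and the predicate factory wraps it in one place so callers of the predicate never see a checked exception.

```java
import java.util.Collection;
import java.util.function.Predicate;

// Hypothetical sketch: keep the checked exception inside the accessor path so the
// predicate (and whatever iterates with it) never has to rethrow.
final class Predicates {
  interface Accessor {
    Object get(Object instance) throws ReflectiveOperationException;
  }

  static <T> Predicate<T> getPredicate(Accessor getter, Collection<?> keys) {
    return value -> {
      try {
        return keys.contains(getter.get(value));
      } catch (ReflectiveOperationException e) {
        // Wrapping happens here, once, instead of at every call site.
        throw new RuntimeException(e);
      }
    };
  }
}
```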

@@ -79,55 +74,49 @@ public long count(Class<?> type, String index, Object indexedValue) throws Excep

@Override
public <T> T read(Class<T> klass, Object naturalKey) {


Is Class<T> klass vs. Class<?> type an intentional choice or an accidental consistency? Either way, I'm not sure that I am following the logic of the convention.


Hmmm... also not entirely consistent across related files.

@davidnavas (Author)

I did notice this, but I didn't follow up and look. Thinking about it before looking, I assume Class type doesn't work in Scala, while Class[T] klass would. Let me see what's up with that, though...


I was thinking more about naming convention, klass vs. type -- more about understandability than functionality.

@davidnavas (Author)

I think it's 'type' because of KVTypeInfo.
But KVStore seems to be mixed, and ElementTrackingStore always uses klass.
