Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Allow a Filter to be used as the Expression in a SelectColumn. #6365

Merged
merged 22 commits into from
Nov 22, 2024

Conversation

cpwright
Copy link
Contributor

No description provided.

@cpwright cpwright requested a review from devinrsmith November 11, 2024 21:42
@cpwright cpwright changed the title Allow a Filter to be used as the Expression in a SelectColumn. feat: Allow a Filter to be used as the Expression in a SelectColumn. Nov 13, 2024
@cpwright
Copy link
Contributor Author

Fix #6364. (The engine side, does not cover the gRPC side which requires wider consensus.)

@cpwright cpwright added DocumentationNeeded NoReleaseNotesNeeded No release notes are needed. labels Nov 13, 2024
@cpwright cpwright marked this pull request as ready for review November 13, 2024 16:34
@devinrsmith devinrsmith added this to the 0.37.0 milestone Nov 14, 2024
Copy link
Member

@devinrsmith devinrsmith left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. @rcaudy for final pass.

@devinrsmith devinrsmith requested a review from rcaudy November 14, 2024 19:31
Comment on lines 252 to 267
final RowSet.Iterator inputIt = inputRowSet.iterator();
final RowSet.Iterator trueIt = filtered.iterator()) {
long nextTrue = trueIt.hasNext() ? trueIt.nextLong() : -1;
int offset = 0;
while (nextTrue >= 0) {
// the input iterator is a superset of the true iterator, so we can always find out what
// the next value is without needing to check hasNext
final long nextInput = inputIt.nextLong();
final boolean found = nextInput == nextTrue;
booleanDestination.set(offset++, found);
if (found) {
nextTrue = trueIt.hasNext() ? trueIt.nextLong() : -1;
}
}
// fill everything else up with false, because nothing else can match
booleanDestination.fillWithBoxedValue(offset, booleanDestination.size() - offset, false);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Prefer bulk advance?

Suggested change
final RowSet.Iterator inputIt = inputRowSet.iterator();
final RowSet.Iterator trueIt = filtered.iterator()) {
long nextTrue = trueIt.hasNext() ? trueIt.nextLong() : -1;
int offset = 0;
while (nextTrue >= 0) {
// the input iterator is a superset of the true iterator, so we can always find out what
// the next value is without needing to check hasNext
final long nextInput = inputIt.nextLong();
final boolean found = nextInput == nextTrue;
booleanDestination.set(offset++, found);
if (found) {
nextTrue = trueIt.hasNext() ? trueIt.nextLong() : -1;
}
}
// fill everything else up with false, because nothing else can match
booleanDestination.fillWithBoxedValue(offset, booleanDestination.size() - offset, false);
final RowSequence.Iterator inputRows = inputRowSet.getRowSequenceIterator();
final RowSet.Iterator trueRows = filtered.iterator()) {
int offset = 0;
while (trueRows.hasNext()) {
final long nextTrue = trueRows.nextLong();
// Find all the false rows between the last consumed input row and the next true row
final int falsesSkipped = (int) inputRows.advanceAndGetPositionDistance(nextTrue + 1) - 1;
if (falsesSkipped > 0) {
booleanDestination.fillWithValue(offset, falsesSkipped, false);
offset += falsesSkipped;
}
booleanDestination.set(offset++, true);
}
final int remainingFalses = booleanDestination.size() - offset;
// Fill everything else up with false, because we've exhausted the trues
if (remainingFalses > 0) {
booleanDestination.fillWithValue(offset, remainingFalses, false);
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Original code (density is the fraction of true values, uniformly distributed):

Density: 1.0000, Nanos: 11560028500, per cell=11.56
Density: 0.9999, Nanos: 13277620834, per cell=13.28
Density: 0.9990, Nanos: 14978527958, per cell=14.98
Density: 0.8725, Nanos: 21772909750, per cell=21.77
Density: 0.7500, Nanos: 28369796167, per cell=28.37
Density: 0.5000, Nanos: 36947907333, per cell=36.95
Density: 0.2500, Nanos: 24845209958, per cell=24.85
Density: 0.1250, Nanos: 17964264291, per cell=17.96
Density: 0.0010, Nanos: 11357967500, per cell=11.36
Density: 0.0001, Nanos: 10771053167, per cell=10.77

New code:

Density: 1.0000, Nanos: 11589953167, per cell=11.59
Density: 0.9999, Nanos: 13430658834, per cell=13.43
Density: 0.9990, Nanos: 15374023083, per cell=15.37
Density: 0.8725, Nanos: 22405263417, per cell=22.41
Density: 0.7500, Nanos: 28906296084, per cell=28.91
Density: 0.5000, Nanos: 36402417875, per cell=36.40
Density: 0.2500, Nanos: 24535559166, per cell=24.54
Density: 0.1250, Nanos: 17518296583, per cell=17.52
Density: 0.0010, Nanos: 10686682416, per cell=10.69
Density: 0.0001, Nanos: 10635898708, per cell=10.64

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can swap in which ever implementation you prefer at this point.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I like the new code better, but there's no strong case for either.

Copy link
Member

@rcaudy rcaudy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

.

@@ -107,6 +113,18 @@ private List<String> checkForInvalidFilters() {
"Cannot use a filter with virtual row variables (i, ii, or k) in select, view, update, or updateView: "
+ filter);
}
if (filter.isRefreshing()) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe this means that when we gather filters that are dependencies in QueryTable or WouldMatch, we should only consider refreshing filters.

@cpwright cpwright enabled auto-merge (squash) November 22, 2024 21:12
@cpwright cpwright merged commit 3a547af into deephaven:main Nov 22, 2024
17 checks passed
@github-actions github-actions bot locked and limited conversation to collaborators Nov 22, 2024
@deephaven-internal
Copy link
Contributor

Labels indicate documentation is required. Issues for documentation have been opened:

Community: deephaven/deephaven-docs-community#366

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants