Fix inconsistencies in aggregations and query library functions. #5368

chipkent · 2024-04-16T15:42:38Z

Aggregation operations in query library functions and built-in query aggregations are inconsistent. This PR makes them consistent by changing the query library functions as follows.

percentile now returns the primitive type.
sum returns a widened type of double for floating point inputs or long for integer inputs.
product returns a widened type of double for floating point inputs or long for integer inputs.
cumsum returns a widened type of double[] for floating point inputs or long[] for integer inputs.
cumprod returns a widened type of double[] for floating point inputs or long[] for integer inputs.
wsum returns a widened type of long for all integer inputs and double for inputs containing floating points.

Note: Because the types have changed, the NULL return values have changed as well.
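To make the widening and the NULL note concrete, here is a minimal sketch in plain Java. It is not the generated Numeric.ftl code: the class name is hypothetical, the sentinel constants are local stand-ins assumed to follow the QueryConstants convention, and returning the widened null for empty or all-null input is an assumption.

```java
class WidenedSumSketch {
    // Local stand-ins for the null sentinels (assumed values, defined here so the sketch is self-contained).
    static final int NULL_INT = Integer.MIN_VALUE;
    static final long NULL_LONG = Long.MIN_VALUE;
    static final float NULL_FLOAT = -Float.MAX_VALUE;
    static final double NULL_DOUBLE = -Double.MAX_VALUE;

    // Integer input: accumulate in, and return, a long.
    static long sum(int[] values) {
        long total = 0;
        long count = 0;
        for (int v : values) {
            if (v != NULL_INT) {
                total += v;
                count++;
            }
        }
        // Assumption: no non-null values yields the null of the widened type.
        return count == 0 ? NULL_LONG : total;
    }

    // Floating point input: accumulate in, and return, a double.
    static double sum(float[] values) {
        double total = 0;
        long count = 0;
        for (float v : values) {
            if (v != NULL_FLOAT) {
                total += v;
                count++;
            }
        }
        return count == 0 ? NULL_DOUBLE : total;
    }

    public static void main(String[] args) {
        // An int accumulator would overflow here; the long result does not.
        System.out.println(sum(new int[] {Integer.MAX_VALUE, 1}));
        System.out.println(sum(new float[] {1.5f, 2.5f}));
    }
}
```

A caller that previously compared the result of an int sum against the int null would, under these assumptions, need to compare against the long null instead.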

Resolves #4023

Initial changes to sum to avoid double conversions.

rcaudy · 2024-04-16T19:07:54Z

engine/function/src/templates/Numeric.ftl

                if (isNaN(c)) {
-                    return ${pt.boxed}.NaN;
+                    return Double.NaN;


Should we handle infinities the way aggs do?

    private double currentValueWithSum(long totalNormalCount, long totalNanCount,
            long totalPositiveInfinityCount, long totalNegativeInfinityCount, double newSum) {
        if (totalNanCount > 0 || (totalPositiveInfinityCount > 0 && totalNegativeInfinityCount > 0)) {
            return Double.NaN;
        }
        if (totalNegativeInfinityCount > 0) {
            return Double.NEGATIVE_INFINITY;
        }
        if (totalPositiveInfinityCount > 0) {
            return Double.POSITIVE_INFINITY;
        }
        if (totalNormalCount == 0) {
            return QueryConstants.NULL_DOUBLE;
        }
        return (double) newSum;
    }

They should give the same result, and the current impl has less branching. The current impl could be improved with a short circuit.

With the new short-circuit code, we have:

    if (isNaN(c) || isNaN(sum)) {
        return Double.NaN;
    }

I think this is as good as we can do. We can only short circuit on NaN. The behavior should be the same as the aggs.

Unit test added.
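The claim in this thread can be checked with a small side-by-side sketch. Both methods below are illustrative rewrites for double[] input, not the actual template or aggregation code: the class and method names are hypothetical, NULL_DOUBLE is a local stand-in, and the counting variant is only modeled on the currentValueWithSum logic quoted above. The two agree as long as the finite partial sums do not themselves overflow to infinity.

```java
class SumInfinitySketch {
    static final double NULL_DOUBLE = -Double.MAX_VALUE; // assumed null sentinel

    // Short-circuits on NaN and otherwise relies on IEEE 754 arithmetic:
    // inf + finite = inf, and (+inf) + (-inf) = NaN.
    static double sumShortCircuit(double[] values) {
        double sum = 0;
        long count = 0;
        for (double c : values) {
            if (Double.isNaN(c) || Double.isNaN(sum)) {
                return Double.NaN;
            }
            if (c != NULL_DOUBLE) {
                sum += c;
                count++;
            }
        }
        return count == 0 ? NULL_DOUBLE : sum;
    }

    // Counts NaN and infinite inputs explicitly, in the style of the aggregation code.
    static double sumCounting(double[] values) {
        long normal = 0, nan = 0, posInf = 0, negInf = 0;
        double sum = 0;
        for (double c : values) {
            if (c == NULL_DOUBLE) {
                continue;
            }
            if (Double.isNaN(c)) {
                nan++;
            } else if (c == Double.POSITIVE_INFINITY) {
                posInf++;
            } else if (c == Double.NEGATIVE_INFINITY) {
                negInf++;
            } else {
                normal++;
                sum += c;
            }
        }
        if (nan > 0 || (posInf > 0 && negInf > 0)) {
            return Double.NaN;
        }
        if (negInf > 0) {
            return Double.NEGATIVE_INFINITY;
        }
        if (posInf > 0) {
            return Double.POSITIVE_INFINITY;
        }
        return normal == 0 ? NULL_DOUBLE : sum;
    }

    public static void main(String[] args) {
        double[] mixed = {1.0, Double.POSITIVE_INFINITY, Double.NEGATIVE_INFINITY};
        System.out.println(sumShortCircuit(mixed) + " " + sumCounting(mixed));
    }
}
```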

rcaudy · 2024-04-16T19:15:31Z

engine/function/src/templates/Numeric.ftl

+            return NULL_DOUBLE;
+        }
+
+        return hasZero ? 0 : prod;


I think we're missing a hasInf case here. Presumably that should yield Double.NaN or the correctly signed infinity; I'm honestly not sure which is the most consistent.

Including the new short-circuit logic, this is the code:

    try ( final ${pt.vectorIterator} vi = values.iterator() ) {
        while ( vi.hasNext() ) {
            final ${pt.primitive} c = vi.${pt.iteratorNext}();

            if (isNaN(c) || isNaN(prod)) {
                return Double.NaN;
            } else if (Double.isInfinite(c)) {
                if (hasZero) {
                    return Double.NaN;
                }
                hasInf = true;
            } else if (c == 0) {
                if (hasInf) {
                    return Double.NaN;
                }
                hasZero = true;
            }

            if (!isNull(c)) {
                count++;
                prod *= c;
            }
        }
    }

    if (count == 0) {
        return NULL_DOUBLE;
    }

    return hasZero ? 0 : prod;

Infinite values behave as expected.

So the code here should be behaving properly.

Unit test added.
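For readers who want to try the zero and infinity cases, the following is a plain-Java transcription of the loop above for double[] input. It is a sketch rather than the generated code: the class name is hypothetical, NULL_DOUBLE is a local stand-in, and the starting value of prod (1) is an assumption because the initialization is not shown in the thread.

```java
class ProductSketch {
    static final double NULL_DOUBLE = -Double.MAX_VALUE; // assumed null sentinel

    static double product(double[] values) {
        double prod = 1; // assumed initial value
        long count = 0;
        boolean hasZero = false;
        boolean hasInf = false;

        for (double c : values) {
            if (Double.isNaN(c) || Double.isNaN(prod)) {
                return Double.NaN;
            } else if (Double.isInfinite(c)) {
                if (hasZero) {
                    return Double.NaN; // 0 * inf is undefined
                }
                hasInf = true;
            } else if (c == 0) {
                if (hasInf) {
                    return Double.NaN; // inf * 0 is undefined
                }
                hasZero = true;
            }

            if (c != NULL_DOUBLE) {
                count++;
                prod *= c;
            }
        }

        if (count == 0) {
            return NULL_DOUBLE;
        }
        return hasZero ? 0 : prod;
    }

    public static void main(String[] args) {
        System.out.println(product(new double[] {-2.0, Double.POSITIVE_INFINITY}));
        System.out.println(product(new double[] {0.0, Double.POSITIVE_INFINITY}));
    }
}
```

Under these assumptions, an infinity multiplied by finite non-zero values keeps the sign that IEEE 754 multiplication gives it, while any mix of zero and infinity yields NaN regardless of order.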

rcaudy · 2024-04-16T19:26:33Z

engine/function/src/templates/Numeric.ftl


    /**
     * Returns the cumulative sum.  Null values are excluded.
     *
     * @param values values.
     * @return cumulative sum of non-null values.
     */
-    public static ${pt.primitive}[] cumsum(${pt.vector} values) {
+    <#if pt.valueType.isFloat >
+    public static double[] cumsum(${pt.vector} values) {


We probably need the same kind of infinity handling as in sum.

As in the other cases, the code should be handling infinity fine. Better short-circuiting was added.

Unit test added.

rcaudy · 2024-04-16T20:02:14Z

engine/function/src/templates/Numeric.ftl

+                if (isNaN(v)) {
+                    Arrays.fill(result, i, n, Double.NaN);
+                    return result;
+                } else if (isNull(result[i - 1])) {


Maybe missing infinity handling?

Better short-circuiting was added. Infinity handling should be fine.

Unit test added.
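The NaN short circuit in the excerpt above can be illustrated with a cumulative sum over double[]. Only the isNaN branch and the isNull(result[i - 1]) check come from the diff; the rest of the loop, the handling of the first element, the class name, and the NULL_DOUBLE stand-in are assumptions made for this sketch.

```java
import java.util.Arrays;

class CumSumSketch {
    static final double NULL_DOUBLE = -Double.MAX_VALUE; // assumed null sentinel

    static double[] cumsum(double[] values) {
        final int n = values.length;
        final double[] result = new double[n];
        if (n == 0) {
            return result;
        }
        if (Double.isNaN(values[0])) {
            Arrays.fill(result, Double.NaN);
            return result;
        }
        result[0] = values[0];

        for (int i = 1; i < n; i++) {
            final double v = values[i];
            if (Double.isNaN(v)) {
                // Every later entry must be NaN, so fill the tail and stop.
                Arrays.fill(result, i, n, Double.NaN);
                return result;
            } else if (result[i - 1] == NULL_DOUBLE) {
                result[i] = v; // nothing accumulated yet
            } else if (v == NULL_DOUBLE) {
                result[i] = result[i - 1]; // nulls are excluded
            } else {
                // IEEE 754 covers infinities: inf + finite = inf, (+inf) + (-inf) = NaN.
                result[i] = result[i - 1] + v;
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(cumsum(new double[] {1.0, Double.NaN, 3.0})));
    }
}
```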

rcaudy · 2024-04-16T20:12:37Z

engine/function/src/templates/Numeric.ftl

+                    vsum += (double) c * w;
+                   <#else>
+                    vsum += c * (double) w;
+                   </#if>
                }
            }
        }

        return vsum;


Do we need some infinity handling here?

Better short-circuiting was added. Infinity handling should be fine.

Unit test added.
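As a rough illustration of the widening in wsum, the sketch below shows one all-integer overload returning long and one mixed overload returning double. The class name, the null-skipping rule, the NaN short circuit, and the equal-length assumption are all hypothetical; handling of empty or all-null input is deliberately left out because the excerpt does not show it.

```java
class WeightedSumSketch {
    static final int NULL_INT = Integer.MIN_VALUE;       // assumed null sentinel
    static final double NULL_DOUBLE = -Double.MAX_VALUE; // assumed null sentinel

    // All-integer inputs: widen to long before multiplying so the product cannot overflow int.
    static long wsum(int[] values, int[] weights) {
        long vsum = 0;
        for (int i = 0; i < values.length; i++) {
            if (values[i] != NULL_INT && weights[i] != NULL_INT) {
                vsum += (long) values[i] * weights[i];
            }
        }
        return vsum;
    }

    // Any floating point input: accumulate in double and let IEEE 754 propagate infinities.
    static double wsum(int[] values, double[] weights) {
        double vsum = 0;
        for (int i = 0; i < values.length; i++) {
            final int c = values[i];
            final double w = weights[i];
            if (c == NULL_INT || w == NULL_DOUBLE) {
                continue;
            }
            if (Double.isNaN(w)) {
                return Double.NaN; // nothing later can change a NaN result
            }
            vsum += (double) c * w;
        }
        return vsum;
    }

    public static void main(String[] args) {
        System.out.println(wsum(new int[] {100000, 2}, new int[] {100000, 3}));
        System.out.println(wsum(new int[] {1, 2}, new double[] {0.5, 1.5}));
    }
}
```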

deephaven-internal · 2024-05-06T17:50:04Z

Labels indicate documentation is required. Issues for documentation have been opened:

Community: deephaven/deephaven-docs-community#206

chipkent added 6 commits April 15, 2024 22:07

Changed percentile return type to the primitive type.

c38bb13

Initial changes to sum to avoid double conversions.

Widen sum return type.

b0f33e0

Widen product return type.

61527e2

Widen cumsum return type.

bc196b4

Widen cumprod return type.

c8d91c3

Widen wsum return type.

73f048e

chipkent added query engine DocumentationNeeded breaking ReleaseNotesNeeded labels Apr 16, 2024

chipkent added this to the 1. March 2024 milestone Apr 16, 2024

chipkent requested review from lbooker42, kosak and rcaudy April 16, 2024 15:42

Regenerated groovy static imports.

9e4f74f

lbooker42 reviewed Apr 16, 2024

chipkent added 3 commits April 16, 2024 10:18

Fixed unit tests.

4a4863d

Addressed review comments.

fc67adc

Addressed review comments.

6fda5ad

chipkent requested a review from lbooker42 April 16, 2024 16:57

lbooker42 previously approved these changes Apr 16, 2024

lbooker42 mentioned this pull request Apr 16, 2024

Widen returned types for UpdateBy floating point operations. #5371

Merged

rcaudy reviewed Apr 16, 2024

kosak previously approved these changes Apr 16, 2024

Addressed review comments. Added more short circuits and remove unneeded casts.

aea35f6

chipkent dismissed stale reviews from kosak and lbooker42 via aea35f6 April 16, 2024 20:33

Addressed review comments. Added more short circuits.

020d3f0

kosak reviewed Apr 16, 2024

Addressed review comments. More unit tests.

f7d682d

kosak reviewed Apr 16, 2024

chipkent requested review from rcaudy, lbooker42 and kosak April 16, 2024 21:32

Addressed review comments.

81e449b

rcaudy approved these changes Apr 16, 2024

kosak approved these changes Apr 17, 2024

chipkent added the devrel-watch label Apr 17, 2024

chipkent merged commit 3020365 into deephaven:main May 6, 2024
17 checks passed

chipkent deleted the 4023_agg_inconsitencies branch May 6, 2024 17:49

github-actions bot locked and limited conversation to collaborators May 6, 2024

chipkent removed the devrel-watch label Aug 5, 2024
