Commit

Merge branch 'main' into string_containment_functions
jacques-n authored Jul 25, 2022
2 parents f946374 + f7c5da5 commit cac2771
Showing 9 changed files with 306 additions and 16 deletions.
11 changes: 11 additions & 0 deletions CHANGELOG.md
@@ -1,6 +1,17 @@
Release Notes
---

## [0.8.0](https://github.com/substrait-io/substrait/compare/v0.7.0...v0.8.0) (2022-07-17)


### ⚠ BREAKING CHANGES

* The signatures of the divide functions for multiple types now specify an enumeration before the operands.

### Bug Fixes

* add overflow behavior to integer division ([#223](https://github.com/substrait-io/substrait/issues/223)) ([cf552d7](https://github.com/substrait-io/substrait/commit/cf552d7c76da9a91bce992391356c6ffb5a969ac))

## [0.7.0](https://github.com/substrait-io/substrait/compare/v0.6.0...v0.7.0) (2022-07-11)


137 changes: 136 additions & 1 deletion extensions/functions_arithmetic.yaml
@@ -161,6 +161,36 @@ scalar_functions:
- value: fp64
- value: fp64
return: fp64
-
name: "negate"
description: "Negation of the value"
impls:
- args:
- options: [ SILENT, SATURATE, ERROR ]
required: false
- value: i8
return: i8
- args:
- options: [ SILENT, SATURATE, ERROR ]
required: false
- value: i16
return: i16
- args:
- options: [ SILENT, SATURATE, ERROR ]
required: false
- value: i32
return: i32
- args:
- options: [ SILENT, SATURATE, ERROR ]
required: false
- value: i64
return: i64
- args:
- value: fp32
return: fp32
- args:
- value: fp64
return: fp64
-
name: "modulus"
description: "Get the remainder when dividing one value by another."
@@ -181,9 +211,58 @@ scalar_functions:
- value: i64
- value: i64
return: i64
-
name: "power"
description: "Take the power with the first value as the base and second as exponent."
impls:
- args:
- options: [ SILENT, SATURATE, ERROR ]
required: false
- value: i64
- value: i64
return: i64
- args:
- value: fp32
- value: fp32
return: fp32
- args:
- value: fp64
- value: fp64
return: fp64
-
name: "sqrt"
description: "Square root of the value"
impls:
- args:
- name: rounding
options: [ TIE_TO_EVEN, TIE_AWAY_FROM_ZERO, TRUNCATE, CEILING, FLOOR ]
required: false
- name: on_domain_error
options: [ NAN, ERROR ]
required: false
- value: i64
return: fp64
- args:
- name: rounding
options: [ TIE_TO_EVEN, TIE_AWAY_FROM_ZERO, TRUNCATE, CEILING, FLOOR ]
required: false
- name: on_domain_error
options: [ NAN, ERROR ]
required: false
- value: fp32
return: fp32
- args:
- name: rounding
options: [ TIE_TO_EVEN, TIE_AWAY_FROM_ZERO, TRUNCATE, CEILING, FLOOR ]
required: false
- name: on_domain_error
options: [ NAN, ERROR ]
required: false
- value: fp64
return: fp64
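
The integer `negate` and `power` implementations above take an optional `options: [ SILENT, SATURATE, ERROR ]` argument, which the changelog describes only as overflow behavior. The diff does not define the three values, so the sketch below assumes one plausible reading: `SILENT` wraps in two's complement, `SATURATE` clamps to the type's range, and `ERROR` raises. The helper name `negate_int` and its `bits` parameter are hypothetical and not part of Substrait.

```python
def negate_int(value: int, bits: int = 8, overflow: str = "SILENT") -> int:
    """Negate a signed integer of the given width (hypothetical helper)."""
    lo = -(1 << (bits - 1))       # e.g. -128 for i8
    hi = (1 << (bits - 1)) - 1    # e.g.  127 for i8
    if not lo <= value <= hi:
        raise ValueError(f"{value} does not fit in i{bits}")

    result = -value
    if lo <= result <= hi:
        return result

    # Only value == lo reaches this point: -lo is one past hi.
    if overflow == "ERROR":
        raise OverflowError(f"negating {value} overflows i{bits}")
    if overflow == "SATURATE":
        return hi
    if overflow == "SILENT":
        # Assumption: wrap modulo 2**bits, as two's-complement hardware does.
        return (result - lo) % (1 << bits) + lo
    raise ValueError(f"unknown overflow option: {overflow}")
```

Under these assumptions, negating the minimum `i8` value gives `127` with `SATURATE` and wraps back to `-128` with `SILENT`; every other input is unaffected by the option.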
aggregate_functions:
- name: "sum"
description: Sum a set of values.
description: Sum a set of values. The sum of zero elements yields null.
impls:
- args:
- options: [ SILENT, SATURATE, ERROR ]
@@ -362,3 +441,59 @@ aggregate_functions:
decomposable: MANY
intermediate: fp64?
return: fp64?
window_functions:
- name: "row_number"
description: "the number of the current row within its partition."
impls:
- args: []
nullability: DECLARED_OUTPUT
decomposable: NONE
return: i64?
window_type: PARTITION
- name: "rank"
description: "the rank of the current row, with gaps."
impls:
- args: []
nullability: DECLARED_OUTPUT
decomposable: NONE
return: i64?
window_type: PARTITION
- name: "dense_rank"
description: "the rank of the current row, without gaps."
impls:
- args: []
nullability: DECLARED_OUTPUT
decomposable: NONE
return: i64?
window_type: PARTITION
- name: "percent_rank"
description: "the relative rank of the current row."
impls:
- args: []
nullability: DECLARED_OUTPUT
decomposable: NONE
return: fp64?
window_type: PARTITION
- name: "cume_dist"
description: "the cumulative distribution."
impls:
- args: []
nullability: DECLARED_OUTPUT
decomposable: NONE
return: fp64?
window_type: PARTITION
- name: "ntile"
description: "Return an integer ranging from 1 to the argument value, dividing the partition as equally as possible."
impls:
- args:
- value: i32
nullability: DECLARED_OUTPUT
decomposable: NONE
return: i32?
window_type: PARTITION
- args:
- value: i64
nullability: DECLARED_OUTPUT
decomposable: NONE
return: i64?
window_type: PARTITION
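
The window function descriptions above are terse, so a small worked sketch may help. It computes all six values for a single partition whose ordering keys are already sorted in ascending order. It assumes the conventional SQL definitions (for example, `percent_rank = (rank - 1) / (rows - 1)` and larger `ntile` buckets first), which the YAML itself does not spell out. The function names `ntile_of` and `window_columns` are hypothetical.

```python
from typing import Dict, List, Sequence


def ntile_of(index: int, rows: int, buckets: int) -> int:
    """1-based bucket for the 0-based row index; larger buckets come first."""
    base, extra = divmod(rows, buckets)
    boundary = extra * (base + 1)   # rows covered by the larger buckets
    if index < boundary:
        return index // (base + 1) + 1
    return extra + (index - boundary) // base + 1


def window_columns(sorted_keys: Sequence[int], buckets: int) -> List[Dict[str, float]]:
    """Window values for one partition, given ascending ordering keys."""
    n = len(sorted_keys)
    out: List[Dict[str, float]] = []
    rank = dense = 0
    for i, key in enumerate(sorted_keys):
        if i == 0 or key != sorted_keys[i - 1]:
            rank = i + 1    # rank skips ahead after ties (gaps)
            dense += 1      # dense_rank advances by one (no gaps)
        at_or_before = sum(1 for k in sorted_keys if k <= key)
        out.append({
            "row_number": i + 1,
            "rank": rank,
            "dense_rank": dense,
            "percent_rank": (rank - 1) / (n - 1) if n > 1 else 0.0,
            "cume_dist": at_or_before / n,
            "ntile": ntile_of(i, n, buckets),
        })
    return out
```

For the keys `[10, 20, 20, 30]`, the two tied rows share `rank` 2 and `dense_rank` 2, and the last row gets `rank` 4 but `dense_rank` 3, which is the "with gaps" versus "without gaps" distinction in the descriptions.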
11 changes: 0 additions & 11 deletions extensions/functions_cast.yaml

This file was deleted.

10 changes: 10 additions & 0 deletions extensions/functions_comparison.yaml
@@ -73,4 +73,14 @@ scalar_functions:
- value: any1
return: BOOLEAN
nullability: DECLARED_OUTPUT
-
name: "is_nan"
description: Whether a value is not a number.
impls:
- args:
- value: fp32
return: BOOLEAN
- args:
- value: fp64
return: BOOLEAN

106 changes: 106 additions & 0 deletions extensions/functions_logarithmic.yaml
@@ -0,0 +1,106 @@
%YAML 1.2
---
scalar_functions:
-
name: "ln"
description: "Natural logarithm of the value"
impls:
- args:
- name: rounding
options: [ TIE_TO_EVEN, TIE_AWAY_FROM_ZERO, TRUNCATE, CEILING, FLOOR ]
required: false
- name: on_domain_error
options: [ NAN, ERROR ]
required: false
- value: fp32
return: fp32
- args:
- name: rounding
options: [ TIE_TO_EVEN, TIE_AWAY_FROM_ZERO, TRUNCATE, CEILING, FLOOR ]
required: false
- name: on_domain_error
options: [ NAN, ERROR ]
required: false
- value: fp64
return: fp64
-
name: "log10"
description: "Logarithm to base 10 of the value"
impls:
- args:
- name: rounding
options: [ TIE_TO_EVEN, TIE_AWAY_FROM_ZERO, TRUNCATE, CEILING, FLOOR ]
required: false
- name: on_domain_error
options: [ NAN, ERROR ]
required: false
- value: fp32
return: fp32
- args:
- name: rounding
options: [ TIE_TO_EVEN, TIE_AWAY_FROM_ZERO, TRUNCATE, CEILING, FLOOR ]
required: false
- name: on_domain_error
options: [ NAN, ERROR ]
required: false
- value: fp64
return: fp64
-
name: "log2"
description: "Logarithm to base 2 of the value"
impls:
- args:
- name: rounding
options: [ TIE_TO_EVEN, TIE_AWAY_FROM_ZERO, TRUNCATE, CEILING, FLOOR ]
required: false
- name: on_domain_error
options: [ NAN, ERROR ]
required: false
- value: fp32
return: fp32
- args:
- name: rounding
options: [ TIE_TO_EVEN, TIE_AWAY_FROM_ZERO, TRUNCATE, CEILING, FLOOR ]
required: false
- name: on_domain_error
options: [ NAN, ERROR ]
required: false
- value: fp64
return: fp64
-
name: "logb"
description: >
Logarithm of the value with the given base
logb(x, b) => log_{b} (x)
impls:
- args:
- name: rounding
options: [ TIE_TO_EVEN, TIE_AWAY_FROM_ZERO, TRUNCATE, CEILING, FLOOR ]
required: false
- name: on_domain_error
options: [ NAN, ERROR ]
required: false
- value: fp32
name: "x"
description: "The number `x` to compute the logarithm of"
- value: fp32
name: "base"
description: "The logarithm base `b` to use"
return: fp32
- args:
- name: rounding
options: [ TIE_TO_EVEN, TIE_AWAY_FROM_ZERO, TRUNCATE, CEILING, FLOOR ]
required: false
- name: on_domain_error
options: [ NAN, ERROR ]
required: false
- value: fp64
name: "x"
description: "The number `x` to compute the logarithm of"
- value: fp64
name: "base"
description: "The logarithm base `b` to use"
return: fp64
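
Each logarithm implementation above accepts an optional `on_domain_error` option with values `NAN` and `ERROR`. As a rough illustration, the sketch below applies that choice to `logb(x, b) => log_{b} (x)`. It assumes the domain is `x > 0`, `base > 0`, `base != 1`, that NaN inputs propagate as NaN, and that `NAN` is the behavior when the option is omitted; none of these are stated in the YAML. The `rounding` option is not modeled, and the Python function `logb` here is a hypothetical stand-in.

```python
import math


def logb(x: float, base: float, on_domain_error: str = "NAN") -> float:
    """Logarithm of x in the given base (hypothetical sketch)."""
    if on_domain_error not in ("NAN", "ERROR"):
        raise ValueError(f"unknown on_domain_error option: {on_domain_error}")

    # Assumption: NaN inputs pass through rather than count as domain errors.
    if math.isnan(x) or math.isnan(base):
        return math.nan

    # Assumed domain: positive x, positive base other than 1.
    in_domain = x > 0 and base > 0 and base != 1
    if not in_domain:
        if on_domain_error == "ERROR":
            raise ValueError(f"logb({x}, {base}) is outside the domain")
        return math.nan

    return math.log(x) / math.log(base)
```

With `on_domain_error="NAN"`, a caller can detect the failure afterwards with a NaN check such as the `is_nan` function added in `functions_comparison.yaml`.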


23 changes: 23 additions & 0 deletions extensions/functions_rounding.yaml
@@ -0,0 +1,23 @@
%YAML 1.2
---
scalar_functions:
-
name: "ceil"
description: "Rounding to the ceiling of the value"
impls:
- args:
- value: fp32
return: fp32
- args:
- value: fp64
return: fp64
-
name: "floor"
description: "Rounding to the floor of the value"
impls:
- args:
- value: fp32
return: fp32
- args:
- value: fp64
return: fp64
3 changes: 2 additions & 1 deletion extensions/functions_string.yaml
@@ -44,7 +44,8 @@ scalar_functions:
- value: i32
- value: i32
return: "string"
- name: starts_with
-
name: starts_with
description: Whether this string starts with another string.
impls:
- args:
14 changes: 12 additions & 2 deletions site/docs/relations/logical_relations.md
@@ -282,15 +282,25 @@ The aggregate operation groups input data on one or more sets of grouping keys,
| Inputs | 1 |
| Outputs | 1 |
| Property Maintenance | Maintains distribution if all distribution fields are contained in every grouping set. No orderedness guaranteed. |
| Direct Output Order | The list of distinct columns from each grouping set (ordered by their first appearance) followed by the list of measures in declaration order, followed by an integer describing the associated particular grouping set the value is derived from. |
| Direct Output Order | The list of distinct columns from each grouping set (ordered by their first appearance) followed by the list of measures in declaration order, followed by an `i32` describing the associated particular grouping set the value is derived from (if applicable). |

In its simplest form, an aggregation has only measures. In this case, all records are folded into one, and a column is returned for each aggregate expression in the measures list.

Grouping sets can be used for finer-grained control over which records are folded. Within a grouping set, two records will be folded together if and only if each expression in the grouping set yields the same value for each. The values returned by the grouping sets will be returned as columns to the left of the columns for the aggregate expressions. If a grouping set contains no grouping expressions, all rows will be folded for that grouping set.

It's possible to specify multiple grouping sets in a single aggregate operation. The grouping sets behave more or less independently, with each returned record belonging to one of the grouping sets. The values for the grouping expression columns that are not part of the grouping set for a particular record will be set to null. Two grouping expressions will be returned using the same column if the protobuf messages describing the expressions are equal. The columns for grouping expressions that do *not* appear in *all* grouping sets will be nullable (regardless of the nullability of the type returned by the grouping expression) to accommodate the null insertion.

To further disambiguate which record belongs to which grouping set, an aggregate relation with more than one grouping set receives an extra `i32` column on the right-hand side. The value of this field will be the zero-based index of the grouping set that yielded the record.

If at least one grouping expression is present, the aggregation is allowed to not have any aggregate expressions. An aggregate relation is invalid if it would yield zero columns.
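
The output layout described above can be sketched in a few lines. The example assumes rows are plain dictionaries, grouping expressions are bare column names, the only measure is a sum over one column, and null is represented by `None`. It also assumes at least one grouping set and a non-empty input. The function name `aggregate` and its parameters are hypothetical, not part of any Substrait API.

```python
from typing import Any, Dict, List, Sequence, Tuple

Row = Dict[str, Any]


def aggregate(rows: Sequence[Row],
              grouping_sets: Sequence[Sequence[str]],
              measure_col: str) -> List[Tuple[Any, ...]]:
    """Sum measure_col per grouping set (hypothetical sketch)."""
    # Distinct grouping columns, ordered by first appearance across the sets.
    columns: List[str] = []
    for gset in grouping_sets:
        for col in gset:
            if col not in columns:
                columns.append(col)

    out: List[Tuple[Any, ...]] = []
    for index, gset in enumerate(grouping_sets):
        # Fold records whose grouping expressions all yield the same values.
        groups: Dict[Tuple[Any, ...], Any] = {}
        for row in rows:
            key = tuple(row[col] for col in gset)
            groups[key] = groups.get(key, 0) + row[measure_col]

        for key, total in groups.items():
            values = dict(zip(gset, key))
            # Columns outside this grouping set are set to null (None).
            record = [values.get(col) for col in columns]
            record.append(total)
            # The extra index column appears only with more than one set.
            if len(grouping_sets) > 1:
                record.append(index)
            out.append(tuple(record))
    return out
```

With the grouping sets `[["a", "b"], ["a"], []]`, each output record carries columns `a`, `b`, the sum, and the zero-based grouping set index. Records from the `["a"]` set have `None` in `b`, and the empty set folds every row into a single record with `None` in both grouping columns.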

### Aggregate Properties

| Property | Description | Required |
| ---------------- | ------------------------------------------------------------ | --------------------------------------- |
| Input | The relational input. | Required |
| Grouping Sets | One or more grouping sets. | Optional, required if no measures. |
| Per Grouping Set | A list of expression grouping that the aggregation measured should be calculated for. | Optional, defaults to 0. |
| Per Grouping Set | A list of grouping expressions that the aggregation measures should be calculated for. | Optional. |
| Measures | A list of one or more aggregate expressions along with an optional filter. | Optional, required if no grouping sets. |


7 changes: 6 additions & 1 deletion text/simple_extensions_schema.yaml
@@ -45,6 +45,10 @@ properties:
type: array
items:
$ref: "#/$defs/aggregateFunction"
window_functions:
type: array
items:
$ref: "#/$defs/windowFunction"

$defs:
type:
@@ -220,6 +224,7 @@ $defs:
$ref: "#/$defs/maxset"
decomposable:
$ref: "#/$defs/decomposable"

windowFunction:
type: object
additionalProperties: false
@@ -235,7 +240,7 @@ $defs:
items:
type: object
additionalProperties: false
required: [ intermediate, return ]
required: [ return ]
properties:
args:
$ref: "#/$defs/arguments"
