From 795a9452b9189de435fa19bfc9b4b3ffa2210b5b Mon Sep 17 00:00:00 2001
From: Marios Trivyzas
Date: Fri, 3 May 2019 13:05:33 +0300
Subject: [PATCH 1/3] SQL: [Docs] Add example for custom bucketing with CASE

Add a TIP on how to use CASE to achieve custom bucketing with GROUP BY.

Follows: #41349
---
 .../sql/functions/conditional.asciidoc        | 25 +++++++++++++++++++
 .../language/syntax/commands/select.asciidoc  |  4 +++
 2 files changed, 29 insertions(+)

diff --git a/docs/reference/sql/functions/conditional.asciidoc b/docs/reference/sql/functions/conditional.asciidoc
index cf15504bbe379..dd87ee3d54250 100644
--- a/docs/reference/sql/functions/conditional.asciidoc
+++ b/docs/reference/sql/functions/conditional.asciidoc
@@ -98,6 +98,31 @@ an error message would be returned, mentioning that *'foo'* is of data type *key
 which does not match the expected data type *integer* (based on result *10*).
 ===============================
 
+[[sql-functions-conditional-case-groupby-custom-buckets]]
+[TIP]
+===============================
+CASE can be used as a GROUP BY key in a query to facilitate custom bucketing
+and assign descriptive names to those buckets. If for example the number of
+values for a key are too many or, simply, ranges of those values are more
+interesting than every single value, CASE can create custom buckets as in the
+following example:
+
+[source, sql]
+SELECT count(*) AS count,
+  CASE WHEN NVL(languages, 0) = 0 THEN 'zero'
+    WHEN languages = 1 THEN 'one'
+    WHEN languages = 2 THEN 'bilingual'
+    WHEN languages = 3 THEN 'trilingual'
+    ELSE 'multilingual'
+  END as lang_skills
+FROM employees
+GROUP BY lang_skills
+ORDER BY lang_skills;
+
+With this query, we can create normal grouping buckets for values _0, 1, 2, 3_ with
+descriptive names, and every value _>= 4_ falls into the _multilingual_ bucket.
+===============================
+
 [[sql-functions-conditional-coalesce]]
 ==== `COALESCE`
 
diff --git a/docs/reference/sql/language/syntax/commands/select.asciidoc b/docs/reference/sql/language/syntax/commands/select.asciidoc
index 26fdb2f337ebc..08ebe0ae96497 100644
--- a/docs/reference/sql/language/syntax/commands/select.asciidoc
+++ b/docs/reference/sql/language/syntax/commands/select.asciidoc
@@ -204,6 +204,10 @@ Multiple aggregates used:
 include-tagged::{sql-specs}/docs/docs.csv-spec[groupByAndMultipleAggs]
 ----
 
+[TIP]
+If custom bucketing is required, it can be achieved with the use of `<>`,
+as shown <>.
+
 [[sql-syntax-group-by-implicit]]
 ===== Implicit Grouping
 
From 6fee0e60c25a007bb9c8060a729f76056bb85b76 Mon Sep 17 00:00:00 2001
From: Marios Trivyzas
Date: Mon, 6 May 2019 13:23:32 +0300
Subject: [PATCH 2/3] address comments

---
 docs/reference/sql/functions/conditional.asciidoc | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/docs/reference/sql/functions/conditional.asciidoc b/docs/reference/sql/functions/conditional.asciidoc
index dd87ee3d54250..3a024d7a88aae 100644
--- a/docs/reference/sql/functions/conditional.asciidoc
+++ b/docs/reference/sql/functions/conditional.asciidoc
@@ -102,8 +102,8 @@ which does not match the expected data type *integer* (based on result *10*).
 [TIP]
 ===============================
 CASE can be used as a GROUP BY key in a query to facilitate custom bucketing
-and assign descriptive names to those buckets. If for example the number of
-values for a key are too many or, simply, ranges of those values are more
+and assign descriptive names to those buckets. If, for example, the values
+for a key are too many or, simply, ranges of those values are more
 interesting than every single value, CASE can create custom buckets as in the
 following example:
 
@@ -119,7 +119,7 @@ FROM employees
 GROUP BY lang_skills
 ORDER BY lang_skills;
 
-With this query, we can create normal grouping buckets for values _0, 1, 2, 3_ with
+With this query, one can create normal grouping buckets for values _0, 1, 2, 3_ with
 descriptive names, and every value _>= 4_ falls into the _multilingual_ bucket.
 ===============================
 
From 2e02a9e84e9cef373141ec8ab82cb0dd72c23824 Mon Sep 17 00:00:00 2001
From: Marios Trivyzas
Date: Mon, 6 May 2019 17:31:35 +0300
Subject: [PATCH 3/3] address comment

---
 docs/reference/sql/functions/conditional.asciidoc | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/docs/reference/sql/functions/conditional.asciidoc b/docs/reference/sql/functions/conditional.asciidoc
index 3a024d7a88aae..28703d9141b0c 100644
--- a/docs/reference/sql/functions/conditional.asciidoc
+++ b/docs/reference/sql/functions/conditional.asciidoc
@@ -99,8 +99,8 @@ which does not match the expected data type *integer* (based on result *10*).
 ===============================
 
 [[sql-functions-conditional-case-groupby-custom-buckets]]
-[TIP]
-===============================
+===== Conditional bucketing
+
 CASE can be used as a GROUP BY key in a query to facilitate custom bucketing
 and assign descriptive names to those buckets. If, for example, the values
 for a key are too many or, simply, ranges of those values are more
@@ -121,7 +121,6 @@ ORDER BY lang_skills;
 
 With this query, one can create normal grouping buckets for values _0, 1, 2, 3_ with
 descriptive names, and every value _>= 4_ falls into the _multilingual_ bucket.
-===============================
 
 [[sql-functions-conditional-coalesce]]
 ==== `COALESCE`