From fafba308be7af75d7bac463f581780e753ea92c8 Mon Sep 17 00:00:00 2001
From: Andrei Matei
Date: Wed, 13 Apr 2016 17:06:16 -0400
Subject: [PATCH 1/2] list the types of aggregators

---
 docs/RFCS/distributed_sql.md | 46 ++++++++++++++++++++++++++++++++++++
 1 file changed, 46 insertions(+)

diff --git a/docs/RFCS/distributed_sql.md b/docs/RFCS/distributed_sql.md
index 5a7545193e93..d4a272b18e71 100644
--- a/docs/RFCS/distributed_sql.md
+++ b/docs/RFCS/distributed_sql.md
@@ -17,6 +17,7 @@
   * [Example 1](#example-1)
   * [Example 2](#example-2)
   * [Example 3](#example-3)
+  * [Types of aggregators](#types-of-aggregators)
 * [From logical to physical](#from-logical-to-physical)
   * [Processors](#processors)
   * [Joins](#joins)
@@ -431,6 +432,51 @@ AGGREGATOR final:
 Composition: src -> countdistinctmin -> final
 ```
 
+### Types of aggregators
+
+- `TABLE READER` is a special aggregator, with no input stream. It's configured
+  with spans of a table or index and the schema that it needs to read.
+  Like every other aggregator, it can be configured with a programmable output
+  filter.
+- `PROGRAM` is a fully programmable no-grouping aggregator. It runs a "program"
+  on each individual row. The program can drop the row, or modify it
+  arbitrarily.
+- `JOIN` performs a join on two streams, with equality constraints between
+  certain columns. The aggregator is grouped on the columns that are
+  constrained to be equal. See [Stream joins](#stream-joins).
+- `JOIN READER` performs point-lookups for rows with the keys indicated by the
+  input stream. It can do so by performing (potentially remote) KV reads, or by
+  setting up remote flows. See [Join-by-lookup](#join-by-lookup) and
+  [On-the-fly flows setup](#on-the-fly-flows-setup).
+- `MUTATE` performs insertions/deletions/updates to KV. See section TODO.
+- `SET OPERATION` takes several inputs and performs set arithmetic on them
+  (union, difference).
+- `AGGREGATOR` is the one that does "aggregation" in the SQL sense. It groups
+  rows and computes an aggregate for each group. The grouping is configured
+  using the group key. `AGGREGATOR` can be configured with one or more
+  aggregation functions:
+  - `SUM`
+  - `COUNT`
+  - `COUNT DISTINCT`
+  `AGGREGATOR`'s output schema consists of the group key, plus a configurable
+  subset of the generated aggregated values. The optional output filter has
+  access to the group key and all the aggregated values (i.e. it can use even
+  values that are not ultimately outputted).
+- `SORT` sorts the input according to a configurable set of columns. Note that
+  this is a no-grouping aggregator, hence it can be distributed arbitrarily to
+  the data producers. This means, of course, that it doesn't produce a global
+  ordering; instead, it just guarantees an intra-stream ordering on each
+  physical output stream. The global ordering, when needed, is achieved by an
+  input synchronizer of a grouped processor (such as `LIMIT` or `FINAL`).
+- `LIMIT` is a single-group aggregator that stops after reading a configured
+  number of input rows.
+- `INTENT-COLLECTOR` is a single-group aggregator, scheduled on the gateway,
+  that receives all the intents generated by a `MUTATE` and keeps track of them
+  in memory until the transaction is committed.
+- `FINAL` is a single-group aggregator, scheduled on the gateway, that collects
+  the results of the query. This aggregator will be hooked up to the pgwire
+  connection to the client.
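+
+To make this concrete, here is a minimal Go sketch of an aggregator's core
+loop: rows are consumed from an input stream, optionally grouped, and result
+rows are pushed to an output stream. The `Row`, `RowSource` and `RowReceiver`
+names and the toy `COUNT` aggregator are illustrative placeholders only, not
+part of the proposal.
+
+```go
+package main
+
+import "fmt"
+
+// Row is one tuple of column values flowing between aggregators.
+type Row []interface{}
+
+// RowSource is the consuming end of a stream feeding an aggregator.
+type RowSource interface {
+    // NextRow returns the next row, or nil once the stream is exhausted.
+    NextRow() (Row, error)
+}
+
+// RowReceiver is the producing end of an aggregator's output stream.
+type RowReceiver interface {
+    PushRow(Row) error
+    Close(err error)
+}
+
+// sliceSource and sliceSink are trivial in-memory streams, used only to
+// exercise the sketch.
+type sliceSource struct{ rows []Row }
+
+func (s *sliceSource) NextRow() (Row, error) {
+    if len(s.rows) == 0 {
+        return nil, nil
+    }
+    r := s.rows[0]
+    s.rows = s.rows[1:]
+    return r, nil
+}
+
+type sliceSink struct{ rows []Row }
+
+func (s *sliceSink) PushRow(r Row) error { s.rows = append(s.rows, r); return nil }
+func (s *sliceSink) Close(err error)     {}
+
+// countAggregator is a toy AGGREGATOR configured with a single COUNT function:
+// it groups its input by one key column and emits a (key, count) row per group.
+func countAggregator(in RowSource, out RowReceiver, keyCol int) {
+    counts := map[interface{}]int{}
+    for {
+        row, err := in.NextRow()
+        if err != nil {
+            out.Close(err)
+            return
+        }
+        if row == nil {
+            break // end of the input stream
+        }
+        counts[row[keyCol]]++
+    }
+    for key, c := range counts {
+        if err := out.PushRow(Row{key, c}); err != nil {
+            out.Close(err)
+            return
+        }
+    }
+    out.Close(nil)
+}
+
+func main() {
+    src := &sliceSource{rows: []Row{{"a", 1}, {"b", 2}, {"a", 3}}}
+    sink := &sliceSink{}
+    countAggregator(src, sink, 0)
+    fmt.Println(sink.rows) // e.g. [[a 2] [b 1]]; group order is unspecified
+}
+```
+
+Note how the grouped aggregator buffers per-group state before it can emit
+anything; this is part of what constrains how grouped aggregators can be
+distributed, in contrast to no-grouping aggregators like `PROGRAM` or `SORT`.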
+
 ## From logical to physical
 
 To distribute the computation that was described in terms of aggregators and

From 1468c415982428cc6b62180999c8aebd268951df Mon Sep 17 00:00:00 2001
From: Andrei Matei
Date: Wed, 13 Apr 2016 17:32:07 -0400
Subject: [PATCH 2/2] add DISTINCT as an aggregation function

---
 docs/RFCS/distributed_sql.md | 1 +
 1 file changed, 1 insertion(+)

diff --git a/docs/RFCS/distributed_sql.md b/docs/RFCS/distributed_sql.md
index d4a272b18e71..dbfd8ca99734 100644
--- a/docs/RFCS/distributed_sql.md
+++ b/docs/RFCS/distributed_sql.md
@@ -458,6 +458,7 @@ Composition: src -> countdistinctmin -> final
   - `SUM`
   - `COUNT`
   - `COUNT DISTINCT`
+  - `DISTINCT`
   `AGGREGATOR`'s output schema consists of the group key, plus a configurable
   subset of the generated aggregated values. The optional output filter has
   access to the group key and all the aggregated values (i.e. it can use even