feat: Add ability to explain `groupNode` and it's attribute(s). #641

shahzadlone · 2022-07-17T15:46:40Z

Relevant issue(s)

Resolves #525

For Reviewer(s):

Should be easier to review commit by commit.
This PR completes and puts the lid on the simple @explain feature (with the exception of topLevelNode).
Planned to merge into v0.3.0 release.

Description

Makes groupNode explainable.
Explains the child selects list of attributes of groupNode.
Explains the attribute that represents the field the groupBy is on.
Includes integration tests for various types of groupNode explanations.

Demo

Request:

query @explain {
	author (
		groupBy: [age, verified],
	) {
		age
		_group(filter: {age: {_gt: 63}}) {
			name
		}
	}
}

Response:

{
  "data": [
    {
      "explain": {
        "selectTopNode": {
          "groupNode": {
            "groupByFields": [ "age", "verified" ],
            "childSelects": [
              {
                "collectionName": "author",
                "filter": {
                  "age": {
                    "_gt": 63
                  }
                },
                "docKeys": null,
                "groupBy": null,
                "limit": null,
                "orderBy": null
              }
            ],
            "selectNode": {
              "filter": null,
              "scanNode": {
                "collectionID": "3",
                "collectionName": "author",
                "filter": null,
                "spans": [
                  {
                    "end": "/4",
                    "start": "/3"
                  }
                ]
              }
            }
          }
        }
      }
    }
  ]
}
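
For anyone consuming this output programmatically, here is a minimal sketch of reading `groupByFields` out of the response above. The struct and the `groupByFields` helper are hypothetical names written for this example, not part of DefraDB, and the struct mirrors only the slice of the response it reads.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// explainResponse mirrors only the path to groupByFields in the demo
// response. It is a hypothetical type for this example, not a DefraDB type.
type explainResponse struct {
	Data []struct {
		Explain struct {
			SelectTopNode struct {
				GroupNode struct {
					GroupByFields []string `json:"groupByFields"`
				} `json:"groupNode"`
			} `json:"selectTopNode"`
		} `json:"explain"`
	} `json:"data"`
}

// groupByFields returns the field names the top-level groupNode groups on.
func groupByFields(raw []byte) ([]string, error) {
	var resp explainResponse
	if err := json.Unmarshal(raw, &resp); err != nil {
		return nil, err
	}
	if len(resp.Data) == 0 {
		return nil, fmt.Errorf("explain response has no data entries")
	}
	return resp.Data[0].Explain.SelectTopNode.GroupNode.GroupByFields, nil
}

// demo is a trimmed copy of the response shown above.
const demo = `{"data":[{"explain":{"selectTopNode":{"groupNode":{
	"groupByFields":["age","verified"],
	"childSelects":[{"collectionName":"author","filter":{"age":{"_gt":63}}}]
}}}}]}`

func main() {
	fields, err := groupByFields([]byte(demo))
	if err != nil {
		panic(err)
	}
	fmt.Println(fields) // prints "[age verified]"
}
```

Fields the struct does not declare, such as `childSelects` and `selectNode`, are simply ignored by `encoding/json`.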

Limitations

Lacking tests for groupNode combined with count and sum aggregates, as they would be similar to the graph created by the average groupBy case (but without the averageNode, of course).
We have some tests which will change once the bug "Average query with subType field creates unnecessary typeIndexJoins under parallelNode" (#640) is fixed.
The dockey attribute in the child selection list is untested (I couldn't find a way to select a dockey on the child group).

Tasks

I made sure the code is well commented, particularly hard-to-understand areas.
I made sure the repository-held documentation is changed accordingly.
I made sure the pull request title adheres to the conventional commit style (the subset used in the project can be found in tools/configs/chglog/config.yml).
I made sure to discuss its limitations such as threats to validity, vulnerability to mistake and misuse, robustness to invalidation of assumptions, resource requirements, ...

How has this been tested?

Locally with unit tests + Altair + CI

Specify the platform(s) on which this was tested:

Arch Linux (specifically Manjaro flavor on WSL2)

codecov · 2022-07-17T15:52:18Z

Codecov Report

Merging #641 (08a13d1) into develop (a0332b7) will increase coverage by 0.18%.
The diff coverage is 87.27%.

@@             Coverage Diff             @@
##           develop     #641      +/-   ##
===========================================
+ Coverage    57.14%   57.32%   +0.18%     
===========================================
  Files          122      122              
  Lines        14662    14753      +91     
===========================================
+ Hits          8378     8457      +79     
- Misses        5567     5573       +6     
- Partials       717      723       +6

| Impacted Files | Coverage Δ | |
|---|---|---|
| query/graphql/mapper/targetable.go | `53.84% <ø> (ø)` | |
| query/graphql/planner/explain.go | `68.96% <ø> (ø)` | |
| query/graphql/planner/group.go | `83.51% <85.10%> (+2.46%)` | ⬆️ |
| query/graphql/mapper/mapper.go | `87.86% <100.00%> (+0.07%)` | ⬆️ |
| query/graphql/planner/arbitrary_join.go | `79.55% <100.00%> (ø)` | |

shahzadlone · 2022-07-22T08:52:32Z

tests/integration/query/explain/group_with_dockey_test.go

+							"childSelects": []dataMap{
+								{
+									"collectionName": "author",
+									"docKeys":        nil,


question: not quite sure if this is implemented, because I haven't been able to hit the dockey filter case inside the child group.

tests/integration/query/explain/group_test.go

jsimnz

Blocking issue/suggestion regarding the change to the GroupBy struct (int vs Field).

Will give huge praise for very expansive testing though!

tests/integration/query/explain/group_test.go

jsimnz · 2022-07-22T19:52:01Z

tests/integration/query/explain/group_with_average_test.go

+		Query: `query @explain {
+					author (groupBy: [name]) {
+						name
+						_avg(_group: {field: _avg})
+						_group(groupBy: [verified]) {
+							verified
+								_avg(_group: {field: age})
+						}
+					}
+				}`,


Now that's a complicated query 😅

Couldn't remotely tell you what it's trying to do in plain English.

Slowly piecing it together lol, fun test!

Groups everything by name, and shows the average of the averages of age, grouped by verified.
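
To make that plain-English reading concrete, here is a minimal sketch of the same computation in ordinary Go. It only illustrates the description above; it is not how the planner evaluates the query, and the `author` type, `mean` and `avgOfAvgByName` are hypothetical names with made-up sample data.

```go
package main

import "fmt"

// author is a hypothetical record with the three fields the query touches.
type author struct {
	Name     string
	Verified bool
	Age      float64
}

// mean returns the arithmetic mean of xs, or 0 for an empty slice.
func mean(xs []float64) float64 {
	if len(xs) == 0 {
		return 0
	}
	sum := 0.0
	for _, x := range xs {
		sum += x
	}
	return sum / float64(len(xs))
}

// avgOfAvgByName groups authors by Name, sub-groups each group by Verified,
// averages Age within every sub-group, then averages those averages.
func avgOfAvgByName(authors []author) map[string]float64 {
	ages := map[string]map[bool][]float64{}
	for _, a := range authors {
		if ages[a.Name] == nil {
			ages[a.Name] = map[bool][]float64{}
		}
		ages[a.Name][a.Verified] = append(ages[a.Name][a.Verified], a.Age)
	}

	out := make(map[string]float64, len(ages))
	for name, byVerified := range ages {
		var inner []float64
		for _, group := range byVerified {
			inner = append(inner, mean(group))
		}
		out[name] = mean(inner)
	}
	return out
}

func main() {
	authors := []author{
		{Name: "John", Verified: true, Age: 60},
		{Name: "John", Verified: true, Age: 70},
		{Name: "John", Verified: false, Age: 30},
	}
	// verified=true averages to 65, verified=false to 30, so the outer
	// average is (65 + 30) / 2.
	fmt.Println(avgOfAvgByName(authors)["John"]) // prints "47.5"
}
```

Note that this differs from a plain average of all three ages (about 53.3), which is why the nested `_avg` matters.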

jsimnz · 2022-07-22T19:54:56Z

tests/integration/query/explain/group_with_dockey_test.go

+							"childSelects": []dataMap{
+								{
+									"collectionName": "author",
+									"docKeys":        nil,


Can leave for now as we investigate

jsimnz · 2022-07-22T20:00:55Z

query/graphql/mapper/targetable.go

-	FieldIndexes []int
+	Fields []Field


suggestion(blocking): I'm very hesitant to make a change like this without input from Andy, just for the sake of the explain system.

As I understand it, the goal here is to be able to efficiently get the corresponding FieldName when doing the GroupBy explain, but this change affects a lot of other places (as you know, since you had to update them all).

But, as I understand it, the mapper already has a utility to convert index into FieldName without needing to make a change like this.

e.g. n.documentMapping.TryToFindNameFromIndex(index), which you are already using for the order fields. Is it not possible to use this utility for the groupBy fields as well, which would mean you don't have to make this change?

Fair point John. I do think however that the code reads a bit nicer with this change. For example, this:

for _, keyField := range keyFields {

reads better than this:

for _, keyField := range keyIndexes {

> Fair point John. I do think however that the code reads a bit nicer with this change.

There's certainly more than a few rough edges w.r.t the mapper system, which are being tracked/tackled in #606.

The explain PRs should make an effort to not change core planner functionality. If something does need a refactor, it should be done in a separate PR.

For this specific PR, as far as I can tell, TryToFindNameFromIndex seems like it should be sufficient to circumvent this larger change.

It doesn't really change the functionality though. It merely adds information to a variable (from []int to []struct) and changes its name. I think of it as if it were already a struct, it would just be adding a struct field.

You make a fair point, however this is basically just adding an additional field (is nicer IMO). It's nice to have a guarantee of field index always having a corresponding name.

I wrote the TryToFindNameFromIndex function for finding indexes of ordering elements in orderNode. Using that function wouldn't guarantee that the field name exists (even though it should), and obviously doing a lookup is not as nice if we don't have to.

One other thing I could do to reduce the changes (though I would still prefer this approach) is to keep passing only the list of indexes into the functions whose signatures were changed from []int to []mapper.Field.

LMK what you think, at the end of the day this is a very safe change as it's just tagging an additional field, and the previous field stays there as it was before. I would be concerned if I had removed a field haha

I'd still prefer to try to minimize changes beyond utility funcs for explain PRs.

Although technically this is a small change ([]int to []struct) as Fred pointed out, it does sprawl all over the implementation.

It's OK for now, but let's try to minimize this stuff in the future.
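
For readers skimming this thread later, here is a rough sketch of the two options being weighed. Everything in it is a simplified stand-in: `Field`, `documentMapping`, `tryToFindNameFromIndex`, `namesByLookup` and `namesFromFields` are hypothetical names for this example, and the real `mapper.Field` and `TryToFindNameFromIndex` may differ in shape and signature.

```go
package main

import "fmt"

// Field pairs a field index with its name, modeled loosely on the
// `Fields []Field` change in the diff. The real mapper.Field may differ.
type Field struct {
	Index int
	Name  string
}

// documentMapping is a hypothetical stand-in for the mapper's name/index table.
type documentMapping struct {
	indexesByName map[string][]int
}

// tryToFindNameFromIndex imitates the lookup approach: it scans the mapping
// and reports whether any name owns the target index.
func (m documentMapping) tryToFindNameFromIndex(target int) (string, bool) {
	for name, indexes := range m.indexesByName {
		for _, i := range indexes {
			if i == target {
				return name, true
			}
		}
	}
	return "", false
}

// namesByLookup is option one: keep []int and resolve names when explaining.
// A missing name has to be handled, since the lookup can fail.
func namesByLookup(m documentMapping, indexes []int) ([]string, error) {
	names := make([]string, 0, len(indexes))
	for _, i := range indexes {
		name, ok := m.tryToFindNameFromIndex(i)
		if !ok {
			return nil, fmt.Errorf("no field name for index %d", i)
		}
		names = append(names, name)
	}
	return names, nil
}

// namesFromFields is option two: the name travels with the index, so no
// lookup (and no failure path) is needed.
func namesFromFields(fields []Field) []string {
	names := make([]string, len(fields))
	for i, f := range fields {
		names[i] = f.Name
	}
	return names
}

func main() {
	m := documentMapping{indexesByName: map[string][]int{"age": {0}, "verified": {1}}}

	byLookup, err := namesByLookup(m, []int{0, 1})
	if err != nil {
		panic(err)
	}
	fmt.Println(byLookup)

	fmt.Println(namesFromFields([]Field{{Index: 0, Name: "age"}, {Index: 1, Name: "verified"}}))
}
```

The trade-off discussed above shows up directly: the lookup variant leaves existing signatures alone but carries an error path, while the struct variant removes the error path at the cost of touching every caller.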

fredcarle

LGTM!

shahzadlone added the feature, area/query, and action/no-benchmark labels Jul 17, 2022

shahzadlone added this to the DefraDB v0.3 milestone Jul 17, 2022

shahzadlone self-assigned this Jul 17, 2022

shahzadlone changed the title ~~Not ready to review (cleaning up local commits + testing).~~ feat: Add ability to explain groupNode attribute(s). Jul 18, 2022

shahzadlone force-pushed the lone/feat/explain-group-node-attributes branch 4 times, most recently from 6f3aabe to d021080 Compare July 22, 2022 08:49

shahzadlone commented Jul 22, 2022

tests/integration/query/explain/group_test.go

shahzadlone marked this pull request as ready for review July 22, 2022 09:21

shahzadlone requested a review from a team July 22, 2022 09:21

jsimnz requested changes Jul 22, 2022

jsimnz approved these changes Jul 23, 2022

shahzadlone added 3 commits July 22, 2022 22:47

wip: Store the field key name with the index, i.e. use mapper.Field.

667bc3d

wip: Make groupNode explainable.

81462de

wip: Implement the explanation of the groupNode attribute(s).

dd80b97

shahzadlone force-pushed the lone/feat/explain-group-node-attributes branch from d021080 to 98a5d89 Compare July 23, 2022 02:48

wip: Add tests for groupBy explain cases.

08a13d1

shahzadlone force-pushed the lone/feat/explain-group-node-attributes branch from 98a5d89 to 08a13d1 Compare July 23, 2022 02:51

fredcarle approved these changes Jul 23, 2022

shahzadlone changed the title ~~feat: Add ability to explain groupNode attribute(s).~~ feat: Add ability to explain groupNode and it's attribute(s). Jul 23, 2022

shahzadlone merged commit 355ee34 into develop Jul 23, 2022

shahzadlone deleted the lone/feat/explain-group-node-attributes branch July 23, 2022 03:18
