Fix the filtering for external tables in the Redshift get_columns_in_relation macro #2855

Merged

brangisom
Contributor

@brangisom brangisom commented Oct 27, 2020

resolves #2854

Description

This change ensures that the table_schema = '{{ relation.schema }}' filter is pushed down directly to the svv_external_columns view, so that this macro no longer causes performance issues by making far more API calls than necessary.
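
For illustration, the pushed-down filter might look roughly like the sketch below. This is a simplified, hypothetical fragment, not the macro's actual text: the CTE name external_views and the column aliases are assumptions, schemaname, tablename and columnnum are the svv_external_columns columns that appear in the explain output, and columnname and external_type are other columns of that view.

```sql
-- Simplified sketch of the idea, not the macro's actual text.
-- The CTE name and the column aliases are assumptions for illustration.
with external_views as (

    select
        schemaname    as table_schema,
        tablename     as table_name,
        columnname    as column_name,
        external_type as data_type,
        columnnum     as ordinal_position
    from svv_external_columns
    where tablename = '{{ relation.identifier }}'
      -- Schema filter applied here, against the view itself,
      -- instead of only after the results are unioned.
      and schemaname = '{{ relation.schema }}'

)

select *
from external_views
order by ordinal_position
```

With both predicates inside the CTE, Redshift can apply them in the scan of pg_get_external_columns rather than filtering on table_schema only after the union, which is what the "After" explain output shows.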

Checklist

  • I have signed the CLA
  • I have run this code in development and it appears to resolve the stated issue
  • This PR includes tests, or tests are not required/relevant for this PR
  • I have updated the CHANGELOG.md and added information about my change to the "dbt next" section.

dbt run output

0.19.0-b1

  • Before
➜  invision-dbt git:(for-dbt-pr) ✗ dbt run --models test_model --no-version-check
Running with dbt=0.19.0-b1
Found 651 models, 299 tests, 6 snapshots, 0 analyses, 331 macros, 1 operation, 24 seed files, 326 sources

18:42:46 | Concurrency: 1 threads (target='dev')
18:42:46 |
18:42:46 | 1 of 1 START incremental model analytics_dev_brandon.test_model...... [RUN]
18:54:50 | 1 of 1 OK created incremental model analytics_dev_brandon.test_model. [INSERT 0 7823 in 724.02s]
18:54:51 |
18:54:51 | Running 1 on-run-end hook
18:54:51 | 1 of 1 START hook: invision.on-run-end.0............................. [RUN]
18:54:52 | 1 of 1 OK hook: invision.on-run-end.0................................ [GRANT in 1.40s]
18:54:52 |
18:54:52 |
18:54:52 | Finished running 1 incremental model, 1 hook in 743.35s.

Completed successfully

Done. PASS=1 WARN=0 ERROR=0 SKIP=0 TOTAL=1
  • After
(.venv) ➜  invision-dbt git:(for-dbt-pr) ✗ dbt run --models test_model --no-version-check
Running with dbt=0.19.0-b1
Found 651 models, 299 tests, 6 snapshots, 0 analyses, 331 macros, 1 operation, 24 seed files, 326 sources

19:37:51 | Concurrency: 1 threads (target='dev')
19:37:51 |
19:37:51 | 1 of 1 START incremental model analytics_dev_brandon.test_model...... [RUN]
19:38:03 | 1 of 1 OK created incremental model analytics_dev_brandon.test_model. [INSERT 0 7823 in 11.93s]
19:38:03 |
19:38:03 | Running 1 on-run-end hook
19:38:04 | 1 of 1 START hook: invision.on-run-end.0............................. [RUN]
19:38:05 | 1 of 1 OK hook: invision.on-run-end.0................................ [GRANT in 1.43s]
19:38:05 |
19:38:05 |
19:38:05 | Finished running 1 incremental model, 1 hook in 31.51s.

Completed successfully

Done. PASS=1 WARN=0 ERROR=0 SKIP=0 TOTAL=1

Explain output

  • Before
XN Merge  (cost=999999999999999967336168804116691273849533185806555472917961779471295845921727862608739868455469056.00..999999999999999967336168804116691273849533185806555472917961779471295845921727862608739868455469056.00 rows=1 width=176)
  Merge Key: ordinal_position
  ->  XN Network  (cost=999999999999999967336168804116691273849533185806555472917961779471295845921727862608739868455469056.00..999999999999999967336168804116691273849533185806555472917961779471295845921727862608739868455469056.00 rows=1 width=176)
        Send to leader
        ->  XN Sort  (cost=999999999999999967336168804116691273849533185806555472917961779471295845921727862608739868455469056.00..999999999999999967336168804116691273849533185806555472917961779471295845921727862608739868455469056.00 rows=1 width=176)
              Sort Key: ordinal_position
              ->  XN Subquery Scan unioned  (cost=145262411.58..999999999999999967336168804116691273849533185806555472917961779471295845921727862608739868455469056.00 rows=1 width=176)
                    Filter: (table_schema = 'some'::name)
                    ->  XN Append  (cost=145262411.58..999999999999999967336168804116691273849533185806555472917961779471295845921727862608739868455469056.00 rows=119 width=634)
                          ->  XN Subquery Scan "*SELECT* 1"  (cost=145262411.58..999999999999999967336168804116691273849533185806555472917961779471295845921727862608739868455469056.00 rows=109 width=599)
                                ->  XN Hash Join DS_BCAST_INNER  (cost=145262411.58..999999999999999967336168804116691273849533185806555472917961779471295845921727862608739868455469056.00 rows=109 width=599)
                                      Hash Cond: ("outer".relowner = "inner".usesysid)
                                      Join Filter: (("inner".usename = ("current_user"())::name) OR (has_table_privilege("outer".oid, 'SELECT'::text) = true) OR (has_table_privilege("outer".oid, 'INSERT'::text) = true) OR (has_table_privilege("outer".oid, 'UPDATE'::text) = true) OR (has_table_privilege("outer".oid, 'REFERENCES'::text) = true))
                                      ->  XN Hash Join DS_BCAST_INNER  (cost=145262410.56..999999999999999967336168804116691273849533185806555472917961779471295845921727862608739868455469056.00 rows=109 width=607)
                                            Hash Cond: ("outer".relnamespace = "inner".oid)
                                            ->  XN Hash Join DS_DIST_INNER  (cost=145262409.49..999999999999999967336168804116691273849533185806555472917961779471295845921727862608739868455469056.00 rows=109 width=483)
                                                  Inner Dist Key: c.oid
                                                  Hash Cond: ("outer".attrelid = "inner".oid)
                                                  ->  XN Hash Join DS_BCAST_INNER  (cost=145262383.02..999999999999999967336168804116691273849533185806555472917961779471295845921727862608739868455469056.00 rows=25597 width=475)
                                                        Hash Cond: ("outer".atttypid = "inner".oid)
                                                        ->  XN Hash Left Join DS_DIST_BOTH  (cost=3839.55..999999999999999967336168804116691273849533185806555472917961779471295845921727862608739868455469056.00 rows=25597 width=170)
                                                              Outer Dist Key: a.attrelid
                                                              Inner Dist Key: ad.adrelid
                                                              Hash Cond: (("outer".attrelid = "inner".adrelid) AND ("outer".attnum = "inner".adnum))
                                                              ->  LD Seq Scan on pg_attribute a  (cost=0.00..236.21 rows=3580 width=170)
                                                                    Filter: ((attnum > 0) AND (attisdropped <> true))
                                                              ->  XN Hash  (cost=2559.70..2559.70 rows=255970 width=6)
                                                                    ->  LD Seq Scan on pg_attrdef ad  (cost=0.00..2559.70 rows=255970 width=6)
                                                        ->  XN Hash  (cost=145258542.22..145258542.22 rows=503 width=309)
                                                              ->  XN Hash Left Join DS_DIST_BOTH  (cost=8400035.75..145258542.22 rows=503 width=309)
                                                                    Outer Dist Key: "outer".typbasetype
                                                                    Inner Dist Key: bt.oid
                                                                    Hash Cond: ("outer".typbasetype = "inner".oid)
                                                                    Join Filter: ("outer".typtype = 'd'::"char")
                                                                    ->  XN Hash Join DS_BCAST_INNER  (cost=1.07..8400033.42 rows=503 width=175)
                                                                          Hash Cond: ("outer".typnamespace = "inner".oid)
                                                                          ->  LD Seq Scan on pg_type t  (cost=0.00..21.03 rows=503 width=51)
                                                                          ->  XN Hash  (cost=1.06..1.06 rows=6 width=132)
                                                                                ->  LD Seq Scan on pg_namespace nt  (cost=0.00..1.06 rows=6 width=132)
                                                                    ->  XN Hash  (cost=8400033.42..8400033.42 rows=503 width=138)
                                                                          ->  XN Hash Join DS_BCAST_INNER  (cost=1.07..8400033.42 rows=503 width=138)
                                                                                Hash Cond: ("outer".typnamespace = "inner".oid)
                                                                                ->  LD Seq Scan on pg_type bt  (cost=0.00..21.03 rows=503 width=14)
                                                                                ->  XN Hash  (cost=1.06..1.06 rows=6 width=132)
                                                                                      ->  LD Seq Scan on pg_namespace nbt  (cost=0.00..1.06 rows=6 width=132)
                                                  ->  XN Hash  (cost=26.46..26.46 rows=2 width=12)
                                                        ->  LD Seq Scan on pg_class c  (cost=0.00..26.46 rows=2 width=12)
                                                              Filter: ((((relname)::information_schema.sql_identifier)::text = 'table'::text) AND ((relkind = 'r'::"char") OR (relkind = 'v'::"char")))
                                            ->  XN Hash  (cost=1.06..1.06 rows=6 width=132)
                                                  ->  LD Seq Scan on pg_namespace nc  (cost=0.00..1.06 rows=6 width=132)
                                      ->  XN Hash  (cost=1.01..1.01 rows=1 width=132)
                                            ->  LD Seq Scan on pg_shadow  (cost=0.00..1.01 rows=1 width=132)
                          ->  XN Subquery Scan "*SELECT* 2"  (cost=0.00..15.31 rows=5 width=292)
                                ->  XN Function Scan on pg_get_late_binding_view_cols cols  (cost=0.00..15.26 rows=5 width=292)
                                      Filter: (view_name = 'table'::name)
                          ->  XN Network  (cost=1000000000020.18..1000000000020.58 rows=5 width=634)
                                Distribute Round Robin
                                ->  XN Subquery Scan "*SELECT* 3"  (cost=1000000000020.18..1000000000020.58 rows=5 width=634)
                                      ->  XN Subquery Scan svv_external_columns  (cost=1000000000020.18..1000000000020.53 rows=5 width=634)
                                            ->  XN Merge  (cost=1000000000020.18..1000000000020.20 rows=5 width=168)
                                                  Merge Key: ((btrim((ext_cols.schemaname)::text))::character varying)::character varying(128), ((btrim((ext_cols.tablename)::text))::character varying)::character varying(128), ext_cols.columnnum
                                                  ->  XN Network  (cost=1000000000020.18..1000000000020.20 rows=5 width=168)
                                                        Send to leader
                                                        ->  XN Sort  (cost=1000000000020.18..1000000000020.20 rows=5 width=168)
                                                              Sort Key: ((btrim((ext_cols.schemaname)::text))::character varying)::character varying(128), ((btrim((ext_cols.tablename)::text))::character varying)::character varying(128), ext_cols.columnnum
                                                              ->  XN Function Scan on pg_get_external_columns ext_cols  (cost=0.00..20.12 rows=5 width=168)
                                                                    Filter: ((((btrim((tablename)::text))::character varying)::character varying(128))::text = 'table'::text)

  • After
XN Merge  (cost=999999999999999967336168804116691273849533185806555472917961779471295845921727862608739868455469056.00..999999999999999967336168804116691273849533185806555472917961779471295845921727862608739868455469056.00 rows=1 width=176)
  Merge Key: ordinal_position
  ->  XN Network  (cost=999999999999999967336168804116691273849533185806555472917961779471295845921727862608739868455469056.00..999999999999999967336168804116691273849533185806555472917961779471295845921727862608739868455469056.00 rows=1 width=176)
        Send to leader
        ->  XN Sort  (cost=999999999999999967336168804116691273849533185806555472917961779471295845921727862608739868455469056.00..999999999999999967336168804116691273849533185806555472917961779471295845921727862608739868455469056.00 rows=1 width=176)
              Sort Key: ordinal_position
              ->  XN Subquery Scan unioned  (cost=145262411.58..999999999999999967336168804116691273849533185806555472917961779471295845921727862608739868455469056.00 rows=1 width=176)
                    Filter: (table_schema = 'analytics_dev_brandon'::name)
                    ->  XN Append  (cost=145262411.58..999999999999999967336168804116691273849533185806555472917961779471295845921727862608739868455469056.00 rows=115 width=634)
                          ->  XN Subquery Scan "*SELECT* 1"  (cost=145262411.58..999999999999999967336168804116691273849533185806555472917961779471295845921727862608739868455469056.00 rows=109 width=599)
                                ->  XN Hash Join DS_BCAST_INNER  (cost=145262411.58..999999999999999967336168804116691273849533185806555472917961779471295845921727862608739868455469056.00 rows=109 width=599)
                                      Hash Cond: ("outer".relowner = "inner".usesysid)
                                      Join Filter: (("inner".usename = ("current_user"())::name) OR (has_table_privilege("outer".oid, 'SELECT'::text) = true) OR (has_table_privilege("outer".oid, 'INSERT'::text) = true) OR (has_table_privilege("outer".oid, 'UPDATE'::text) = true) OR (has_table_privilege("outer".oid, 'REFERENCES'::text) = true))
                                      ->  XN Hash Join DS_BCAST_INNER  (cost=145262410.56..999999999999999967336168804116691273849533185806555472917961779471295845921727862608739868455469056.00 rows=109 width=607)
                                            Hash Cond: ("outer".relnamespace = "inner".oid)
                                            ->  XN Hash Join DS_DIST_INNER  (cost=145262409.49..999999999999999967336168804116691273849533185806555472917961779471295845921727862608739868455469056.00 rows=109 width=483)
                                                  Inner Dist Key: c.oid
                                                  Hash Cond: ("outer".attrelid = "inner".oid)
                                                  ->  XN Hash Join DS_BCAST_INNER  (cost=145262383.02..999999999999999967336168804116691273849533185806555472917961779471295845921727862608739868455469056.00 rows=25597 width=475)
                                                        Hash Cond: ("outer".atttypid = "inner".oid)
                                                        ->  XN Hash Left Join DS_DIST_BOTH  (cost=3839.55..999999999999999967336168804116691273849533185806555472917961779471295845921727862608739868455469056.00 rows=25597 width=170)
                                                              Outer Dist Key: a.attrelid
                                                              Inner Dist Key: ad.adrelid
                                                              Hash Cond: (("outer".attrelid = "inner".adrelid) AND ("outer".attnum = "inner".adnum))
                                                              ->  LD Seq Scan on pg_attribute a  (cost=0.00..236.21 rows=3580 width=170)
                                                                    Filter: ((attnum > 0) AND (attisdropped <> true))
                                                              ->  XN Hash  (cost=2559.70..2559.70 rows=255970 width=6)
                                                                    ->  LD Seq Scan on pg_attrdef ad  (cost=0.00..2559.70 rows=255970 width=6)
                                                        ->  XN Hash  (cost=145258542.22..145258542.22 rows=503 width=309)
                                                              ->  XN Hash Left Join DS_DIST_BOTH  (cost=8400035.75..145258542.22 rows=503 width=309)
                                                                    Outer Dist Key: "outer".typbasetype
                                                                    Inner Dist Key: bt.oid
                                                                    Hash Cond: ("outer".typbasetype = "inner".oid)
                                                                    Join Filter: ("outer".typtype = 'd'::"char")
                                                                    ->  XN Hash Join DS_BCAST_INNER  (cost=1.07..8400033.42 rows=503 width=175)
                                                                          Hash Cond: ("outer".typnamespace = "inner".oid)
                                                                          ->  LD Seq Scan on pg_type t  (cost=0.00..21.03 rows=503 width=51)
                                                                          ->  XN Hash  (cost=1.06..1.06 rows=6 width=132)
                                                                                ->  LD Seq Scan on pg_namespace nt  (cost=0.00..1.06 rows=6 width=132)
                                                                    ->  XN Hash  (cost=8400033.42..8400033.42 rows=503 width=138)
                                                                          ->  XN Hash Join DS_BCAST_INNER  (cost=1.07..8400033.42 rows=503 width=138)
                                                                                Hash Cond: ("outer".typnamespace = "inner".oid)
                                                                                ->  LD Seq Scan on pg_type bt  (cost=0.00..21.03 rows=503 width=14)
                                                                                ->  XN Hash  (cost=1.06..1.06 rows=6 width=132)
                                                                                      ->  LD Seq Scan on pg_namespace nbt  (cost=0.00..1.06 rows=6 width=132)
                                                  ->  XN Hash  (cost=26.46..26.46 rows=2 width=12)
                                                        ->  LD Seq Scan on pg_class c  (cost=0.00..26.46 rows=2 width=12)
                                                              Filter: ((((relname)::information_schema.sql_identifier)::text = 'test_model'::text) AND ((relkind = 'r'::"char") OR (relkind = 'v'::"char")))
                                            ->  XN Hash  (cost=1.06..1.06 rows=6 width=132)
                                                  ->  LD Seq Scan on pg_namespace nc  (cost=0.00..1.06 rows=6 width=132)
                                      ->  XN Hash  (cost=1.01..1.01 rows=1 width=132)
                                            ->  LD Seq Scan on pg_shadow  (cost=0.00..1.01 rows=1 width=132)
                          ->  XN Subquery Scan "*SELECT* 2"  (cost=0.00..15.31 rows=5 width=292)
                                ->  XN Function Scan on pg_get_late_binding_view_cols cols  (cost=0.00..15.26 rows=5 width=292)
                                      Filter: (view_name = 'test_model'::name)
                          ->  XN Network  (cost=1000000000027.54..1000000000027.62 rows=1 width=634)
                                Distribute Round Robin
                                ->  XN Subquery Scan "*SELECT* 3"  (cost=1000000000027.54..1000000000027.62 rows=1 width=634)
                                      ->  XN Subquery Scan svv_external_columns  (cost=1000000000027.54..1000000000027.61 rows=1 width=634)
                                            ->  XN Merge  (cost=1000000000027.54..1000000000027.54 rows=1 width=168)
                                                  Merge Key: ((btrim((ext_cols.schemaname)::text))::character varying)::character varying(128), ((btrim((ext_cols.tablename)::text))::character varying)::character varying(128), ext_cols.columnnum
                                                  ->  XN Network  (cost=1000000000027.54..1000000000027.54 rows=1 width=168)
                                                        Send to leader
                                                        ->  XN Sort  (cost=1000000000027.54..1000000000027.54 rows=1 width=168)
                                                              Sort Key: ((btrim((ext_cols.schemaname)::text))::character varying)::character varying(128), ((btrim((ext_cols.tablename)::text))::character varying)::character varying(128), ext_cols.columnnum
                                                              ->  XN Function Scan on pg_get_external_columns ext_cols  (cost=0.00..27.52 rows=1 width=168)
                                                                    Filter: (((((btrim((schemaname)::text))::character varying)::character varying(128))::text = 'analytics_dev_brandon'::text) AND ((((btrim((tablename)::text))::character varying)::character varying(128))::text = 'test_model'::text))

@cla-bot

cla-bot bot commented Oct 27, 2020

Thanks for your pull request, and welcome to our community! We require contributors to sign our Contributor License Agreement and we don't seem to have your signature on file. Check out this article for more information on why we have a CLA.

In order for us to review and merge your code, please submit the Individual Contributor License Agreement form attached above. If you have questions about the CLA, or if you believe you've received this message in error, don't hesitate to ping @drewbanin.

CLA has not been signed by users: @brangisom

@cla-bot cla-bot bot added the cla:yes label Oct 27, 2020
@jtcohen6
Contributor

@cla-bot check

@cla-bot

cla-bot bot commented Oct 27, 2020

The cla-bot has been summoned, and re-checked this pull request!

Contributor

@jtcohen6 jtcohen6 left a comment

This is the easiest approve of my life. Bravo on the research, and identification of a quick fix. Just one teensy comment about the changelog.

CHANGELOG.md Outdated
@@ -34,6 +35,7 @@ Contributors:
- Added strategy-specific validation to improve the relevancy of compilation errors for the `timestamp` and `check` snapshot strategies. (([#2787](https://github.com/fishtown-analytics/dbt/issues/2787), [#2791](https://github.com/fishtown-analytics/dbt/pull/2791))
- Changed rpc test timeouts to avoid locally run test failures ([#2803](https://github.com/fishtown-analytics/dbt/issues/2803),[#2804](https://github.com/fishtown-analytics/dbt/pull/2804))
- Added a debug_query on the base adapter that will allow plugin authors to create custom debug queries ([#2751](https://github.com/fishtown-analytics/dbt/issues/2751),[#2871](https://github.com/fishtown-analytics/dbt/pull/2817))
- Fix Redshift adapter `get_columns_in_relation` macro to push schema filter down to the `svv_external_columns` view ([#2855](https://github.com/fishtown-analytics/dbt/issues/2854))
Contributor

Tiny tiny: could you move this above, under dbt 0.19.0? You can make a new "under the hood" section there. I just don't want it here because it wasn't included in the b1 release.

Contributor Author

Got it moved!

Contributor

@jtcohen6 jtcohen6 left a comment

Thank you for the contribution @brangisom!

@jtcohen6 jtcohen6 merged commit 74eec3b into dbt-labs:dev/kiyoshi-kuromiya Oct 27, 2020