Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve Postgres storage list performance #2945

Merged

Conversation

yannmh
Copy link
Contributor

@yannmh yannmh commented Jun 29, 2017

Summary

Use || standard concatenation instead of the concat function to prevent sequential scans on the vault_kv_store table.

Issue

Long story short, we are running Vault 0.7.3 with the PostgreSQL physical backend, and have seen degraded performance with Vault List operations.
Investigating, we found out that List ops result in sequential scans on the table, instead of using the index set on the parent_path column of the vault_kv_store table.

What I expect

Vault List operations, with the PostgreSQL physical backend, should use the existing vault_kv_store indexes, and avoid sequential scans.

Investigation

The List query as defined in https://github.com/hashicorp/vault/blob/v0.7.3/physical/postgresql.go#L73-L75, uses the concat function, resulting on sequential scans:

enclave=> EXPLAIN SELECT DISTINCT substring(substr(path, length('test')+1) from '^.*?/') FROM "vault_kv_store" WHERE parent_path LIKE concat('test','%');
                                  QUERY PLAN
------------------------------------------------------------------------------
 Unique  (cost=2942.88..2942.89 rows=1 width=32)
   ->  Sort  (cost=2942.88..2942.89 rows=1 width=32)
         Sort Key: ("substring"(substr(path, 5), '^.*?/'::text)) COLLATE "C"
         ->  Seq Scan on vault_kv_store  (cost=0.00..2942.87 rows=1 width=32)
               Filter: (parent_path ~~ concat('test', '%'))
(5 rows)

Using the || string concatenation operator fixes the issue.

enclave=> EXPLAIN SELECT DISTINCT substring(substr(path, length('test')+1) from '^.*?/') FROM "vault_kv_store" WHERE parent_path LIKE 'test' || '%';
                                            QUERY PLAN
---------------------------------------------------------------------------------------------------
 Unique  (cost=8.32..8.32 rows=1 width=32)
   ->  Sort  (cost=8.32..8.32 rows=1 width=32)
         Sort Key: ("substring"(substr(path, 5), '^.*?/'::text)) COLLATE "C"
         ->  Index Scan using parent_path_idx on vault_kv_store  (cost=0.28..8.31 rows=1 width=32)
               Index Cond: ((parent_path >= 'test'::text) AND (parent_path < 'tesu'::text))
               Filter: (parent_path ~~ 'test%'::text)
(6 rows)

I believe this might be a regression introduced with #2393

Use `||` standard concatenation instead of the `concat` function in
order to use the `vault_kv_store` index on `parent_path`.
@jefferai
Copy link
Member

jefferai commented Jul 1, 2017

@ckarenz Any chance you can review this?

@jefferai jefferai added this to the 0.7.4 milestone Jul 1, 2017
@jefferai jefferai changed the title [physical][postgresql] improve list performance Improve Postgres storage list performance Jul 1, 2017
Copy link

@ckarenz ckarenz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice find! This change looks good.

@jefferai
Copy link
Member

jefferai commented Jul 2, 2017

Great, thanks!

@jefferai jefferai merged commit ead553d into hashicorp:master Jul 2, 2017
@jefferai jefferai modified the milestones: 0.7.4, 0.8.0 Jul 24, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants