SQL analyser for Vertica. A wrapper of uber/queryparser.
Features:
- Provides the following information (check out uber/queryparser for more detail):
- What columns were accessed and in what clauses (
SORT
,GROUP
, etc.) - What are the join columns (
JOIN ON
) - What tables were accessed
- Table lineage
- What columns were accessed and in what clauses (
- Built in catalog usage
- Simple json output
- HTTP server mode
Build:
stack setup
stack build
Docker:
docker pull lukasmi/vertica-query-parser:latest
(lukasmi/vertica-query-analyser)
Usage:
vq-analyser catalog_file [query_file]
- specify catalog file and optionally query file, if query file is not specified then stdin is used.vq-analyser catalog_file -d directory
- specify catalog file and directory in which all*.sql
files will be analysed. Will create files after the original ones:<original>.sql.json
on analysis success and<original>.sql.txt
on failure.vq-analyser catalog_file -s -p port
- specify catalog file and port on which to run http server- API
200
andapplication/json
on success400
andtext/plain
when input is incorrect (unparsable/unresolvable)404
andtext/plain
when resource path is specified405
andtext/plain
when not usingPOST
500
andtext/plain
on server internal error503
andtext/plain
on server processing timeout
- API
Examples:
catalog.sql
:
CREATE SCHEMA demo;
CREATE TABLE demo.foo (a INT, b INT);
CREATE TABLE demo.bar AS SELECT * FROM demo.foo;
Commands (pipes to stedolan/jq to format the json):
- Single file -
vq-analyser catalog.sql queries.sql | jq '.'
- Using docker -
docker run -it -v /host/path:/container/path lukasmi/vertica-query-analyser /container/path/catalog.sql /container/path/queries.sql | jq '.'
- Using docker -
- Using http server mode:
- Start the server -
vq-analyser catalog.sql -s -p 3000
- Using docker -
docker run -it -p 3000:3000 -v /host/path:/container/path lukasmi/vertica-query-analyser /container/path/catalog.sql -s -p 3000
- Using docker -
- Issue POST -
cat queries.sql | curl localhost:3000 --data-binary @- | jq '.'
- Start the server -
- Column resolving
queries.sql
:
SELECT * FROM demo.bar WHERE a IS NOT NULL ORDER BY a;
- output:
[ { "statement": "SELECT", "lineage": [], "columns": [ { "clause": "ORDER", "column": "demo.bar.a" }, { "clause": "SELECT", "column": "demo.bar.a" }, { "clause": "WHERE", "column": "demo.bar.a" }, { "clause": "SELECT", "column": "demo.bar.b" } ], "tables": [ "demo.bar" ], "joins": [] } ]
- Joins
queries.sql
:
SELECT foo.a, foo.b FROM demo.foo foo JOIN demo.bar bar ON foo.a = bar.a;
- output:
[ { "statement": "SELECT", "lineage": [], "columns": [ { "clause": "JOIN", "column": "demo.bar.a" }, { "clause": "JOIN", "column": "demo.foo.a" }, { "clause": "SELECT", "column": "demo.foo.a" }, { "clause": "SELECT", "column": "demo.foo.b" } ], "tables": [ "demo.bar", "demo.foo" ], "joins": [ { "left": "demo.bar.a", "right": "demo.foo.a" } ] } ]
- Lineage
queries.sql
:
CREATE TABLE demo.baz AS SELECT foo.a, foo.b FROM demo.foo foo JOIN demo.bar bar ON foo.a = bar.a;
- output
[ { "statement": "CREATE_TABLE", "lineage": [ { "decedent": "demo.baz", "ancestors": [ "demo.bar", "demo.foo" ] } ], "columns": [ { "clause": "JOIN", "column": "demo.bar.a" }, { "clause": "JOIN", "column": "demo.foo.a" }, { "clause": "SELECT", "column": "demo.foo.a" }, { "clause": "SELECT", "column": "demo.foo.b" } ], "tables": [ "demo.bar", "demo.baz", "demo.foo" ], "joins": [ { "left": "demo.bar.a", "right": "demo.foo.a" } ] } ]