Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Metabase tests #20

Merged
merged 41 commits into from
Aug 21, 2023
Merged

Add Metabase tests #20

merged 41 commits into from
Aug 21, 2023

Conversation

bobbyiliev
Copy link
Contributor

@bobbyiliev bobbyiliev commented Jul 31, 2023

This is the initial PR that integrates the Materialize driver with the Metabase tests.

@bobbyiliev bobbyiliev marked this pull request as ready for review August 15, 2023 13:48
Copy link
Contributor

@joacoc joacoc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approving it since I don't have any refutation to anything, only questions to understand how it works and gain some knowledge.

[driver {:keys [database-name]}]
(format "DROP DATABASE IF EXISTS \"%s\";" (ddl.i/format-name driver database-name)))

(defmethod sql.tx/add-fk-sql :materialize [& _] nil)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What would happen if Metabase tries to add a FK in Materialize? Is that a possible scenario?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The key part is this config in the driver itself:

(doseq [[feature supported?] {:foreign-keys              (not config/is-test?)

By default Metabase assumes that it needs foreign keys in order to do joins, but this is not the case for Materialize and some other DB drivers that they have. So we disable this only during the tests, as otherwise the Metatabase test suite tries to add FKs on the tables which we don't support.

resources/metabase-plugin.yaml Outdated Show resolved Hide resolved
image: postgres:15.3-alpine3.18
depends_on:
- materialize
command: sh -c 'echo "Waiting for materialized to start..." && sleep 15 && psql -h materialize -U mz_system -d materialize -p 6877 -c "ALTER SYSTEM SET max_tables = 1000;"'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this only because Metabase create many tables during testing? Does Metabase also creates tables in prod?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is only for the tests. Metabase creates a lot of tables when initializing the tests. Then it uses the test data to run all the tests.

CONTRIBUTING.md Outdated Show resolved Hide resolved
@@ -3,29 +3,71 @@
(:require [clojure
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it correct to say they we use exactly the Postgres Driver and disable all the troublemaker options (set-timezone, datetime-diff, etc.)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep exactly! We first register the Materialize driver and define Postgres as the parent driver, eg:

(driver/register! :materialize, :parent :postgres)

It is very similar to how the Redshift driver is setup as well.

test/metabase/driver/materialize_test.clj Outdated Show resolved Hide resolved
run: yarn build-static-viz

# Use custom deps.edn containing "user/materialize" alias to include driver sources
- name: Run tests
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test takes 12 minutes because it tests all the drivers?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, it only tests Materialize as we add the DRIVERS=materialize before the clojure command. It's just that there are too many tests, around 15k assertions.

- name: Setup Node
uses: actions/setup-node@v2
with:
node-version: "18"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Metabase 46 uses node 16, starting with 47 we use node 18. (For building static-viz this might not matter though.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for pointing that out. I've gone ahead and reverted back to node 16 for the time being.

CONTRIBUTING.md Outdated

```bash
mkdir -p ~/.clojure
cat modules/drivers/materialize/.github/deps.edn | sed -e "s|PWD|$PWD|g" > ~/.clojure/deps.edn
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wouldn't recommend executing this. Many Clojure developers customize their personal deps.edn file.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch! I've changed the instructions accordingly.

[metabase [config :as config] [driver :as driver] [util :as u]]
[metabase.driver.sql-jdbc.execute :as sql-jdbc.execute]
[metabase.driver.sql.query-processor :as sql.qp]
[metabase.util.honey-sql-2 :as h2x]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a question of style: it's unconventional to use hierarchical :requires.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point! I've updated this accordingly.

(sql-jdbc.common/handle-additional-options
(merge
{:classname "org.postgresql.Driver"
:subprotocol "postgresql"
:subname (str "//" host ":" port "/" db)
:ssl true
:subname (str "//" host ":" port "/" db "?options=--cluster%3D" cluster)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can these values be sanitized?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just updated this, let me know if this is what you had in mind

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given that host, port, etc. are user input, we cannot trust them to be safe. We construct a string from them and pass that to the driver. Since we probably cannot fully trust the driver either, we should do everything reasonable to make sure there is nothing suspicious/bad in that string.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you have any suggestions for best practices?
I tried to check how other similar drivers handle this like Redshift here and some other partner drivers like Exasol and ClickHouse but I belive that they also just construct a string and pass it to the driver.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It depends on the underlying driver. I'm guessing that

  • the resulting string should be a valid URI,
  • host should be a host name or IP address,
  • port should be an integer,
  • db should be a valid database name, and
  • cluster should be a valid cluster name.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently out of ideas on a good way to fulfill this request, as this seems to be the way that it's currently handled with all other drivers:

https://github.com/metabase/metabase/blob/master/src/metabase/db/spec.clj#L24-L29

test/metabase/driver/materialize_test.clj Show resolved Hide resolved

(set! *warn-on-reflection* true)

(defmethod driver/database-supports? [:materialize :foreign-keys] [_driver _feature _db] (not config/is-test?))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's odd to have this method defined in two places.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch! Removed the duplicate one!

Comment on lines 29 to 32
(merge
((get-method tx/aggregate-column-info ::tx/test-extensions) driver ag-type field)
(when (= ag-type :sum)
{:base_type :type/BigInteger}))))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
(merge
((get-method tx/aggregate-column-info ::tx/test-extensions) driver ag-type field)
(when (= ag-type :sum)
{:base_type :type/BigInteger}))))
(cond-> ((get-method tx/aggregate-column-info ::tx/test-extensions) driver ag-type field)
(= ag-type :sum) (assoc :base_type :type/BigInteger))))

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for this suggestion!

:type/JSON "JSON"
:type/Time "TIME"
:type/UUID "UUID"}]
(defmethod sql.tx/field-base-type->sql-type [:materialize base-type] [_ _] db-type))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aren't these inherited from the postgres driver?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I don't specify those, one of the tests is failing with the following error:

clojure.lang.ExceptionInfo: Failed to create :materialize 'office-checkins' test database: No test data type mapping for driver :materialize for base type :type/DateTimeWithZoneOffset; add an impl for field-base-type->sql-type.
    database-name: "office-checkins"
           driver: :materialize
       java.lang.Exception: No test data type mapping for driver :materialize for base type :type/DateTimeWithZoneOffset; add an impl for field-base-type->sql-type.

@bobbyiliev bobbyiliev requested a review from metamben August 17, 2023 15:25
@bobbyiliev bobbyiliev requested a review from metamben August 18, 2023 16:24
(defn- validate-connection-details
[{:keys [host]}]
(when-not (re-matches #"^[a-zA-Z0-9.-]+$" host)
(throw (IllegalArgumentException. (str "Invalid host: " host)))))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume port should be either an integer or a string encoding one.

@bobbyiliev bobbyiliev merged commit 54c7e18 into master Aug 21, 2023
@bobbyiliev bobbyiliev deleted the metabase-tests branch August 21, 2023 14:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants