feat: validate_parameters for GSheets #15578

Merged

Conversation

betodealmeida
Member

@betodealmeida betodealmeida commented Jul 7, 2021

SUMMARY

Add a method to the GSheets DB engine spec to validate parameters from the new DB modal. This is currently not used, since GSheets is not exposed to the new modal.

The validation takes a payload that looks like this:

    parameters = {
        "catalog": {
            "private_sheet": "https://docs.google.com/spreadsheets/d/1/edit",
            "public_sheet": "https://docs.google.com/spreadsheets/d/1/edit#gid=1",
            "not_a_sheet": "https://www.google.com/",
        },
        "credentials_info": "SECRET",
    }

It checks each URL to see if it's accessible, using the credentials if present, and returns a SIP-40 error payload indicating any URLs that are not accessible.
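A minimal sketch of the check described above, not the code from this PR. The `validate_catalog` function, the `is_accessible` callback, and the keys of the error entries are hypothetical names chosen for illustration; the real method and the exact SIP-40 schema may differ.

```python
from typing import Any, Callable, Dict, List


def validate_catalog(
    parameters: Dict[str, Any],
    is_accessible: Callable[[str, Any], bool],
) -> List[Dict[str, Any]]:
    """Return one error entry per catalog URL that cannot be accessed."""
    # Credentials are optional; pass None through when they are absent.
    credentials = parameters.get("credentials_info")
    errors: List[Dict[str, Any]] = []
    for name, url in parameters.get("catalog", {}).items():
        if not is_accessible(url, credentials):
            # Hypothetical error shape, loosely modelled on an error payload.
            errors.append(
                {
                    "message": f"Unable to access sheet {name!r}",
                    "level": "warning",
                    "extra": {"name": name, "url": url},
                }
            )
    return errors


def fake_is_accessible(url: str, credentials: Any) -> bool:
    """Stand-in for the real network check: only Sheets URLs 'work'."""
    return url.startswith("https://docs.google.com/spreadsheets/")


parameters = {
    "catalog": {
        "private_sheet": "https://docs.google.com/spreadsheets/d/1/edit",
        "public_sheet": "https://docs.google.com/spreadsheets/d/1/edit#gid=1",
        "not_a_sheet": "https://www.google.com/",
    },
    "credentials_info": "SECRET",
}

errors = validate_catalog(parameters, fake_is_accessible)
```

Passing the accessibility check in as a callback keeps the sketch free of network calls, which is also roughly what mocking achieves in a unit test.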

Shoutout to @ofekisr for decoupling unit tests from integration tests! The tests I added are unit tests, though they still require an app context. I created a small fixture to provide that, and added the pytest-mock plugin since it helps keep tests cleaner.

BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF

N/A

TESTING INSTRUCTIONS

Currently this method is a no-op, but I added unit tests and did a manual integration test.

$ pytest tests/unit_tests
...
3 passed, 19 warnings in 2.68s

ADDITIONAL INFORMATION

  • Has associated issue:
  • Changes UI
  • Includes DB Migration (follow approval process in SIP-59)
    • Migration is atomic, supports rollback & is backwards-compatible
    • Confirm DB migration upgrade and downgrade tested
    • Runtime estimates and downtime expectations provided
  • Introduces new feature or API
  • Removes existing feature or API

@betodealmeida betodealmeida force-pushed the gsheets_validate_parameters branch from 10d7a37 to 3ec3646 Compare July 7, 2021 21:30
@betodealmeida betodealmeida changed the title WIP feat: validate_parameters for GSheets Jul 7, 2021
@betodealmeida betodealmeida requested a review from hughhhh July 7, 2021 21:30

import pytest

from superset.app import create_app
Contributor

@ofekisr ofekisr Jul 7, 2021

Please put the imports inside the fixture, since importing superset triggers a lot of unnecessary loading.

Member Author

Ah, good catch. Will fix it. But keep in mind that just importing create_app doesn't do any initialization, and the import is reasonably quick.

Contributor

> Ah, good catch. Will fix it. But keep in mind that just importing create_app doesn't do any initialization, and the import is reasonably quick.

Importing the superset package loads the superset extension, which then loads other modules and initializes many objects there.

Member Author

Oh, for sure. I was talking about how long it used to take before we moved things into create_app — then it would take many seconds to import anything from Superset.
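For readers following this thread, here is a rough sketch of what "imports inside the fixture" can look like. It assumes `create_app` returns a Flask application, as the import in the diff suggests; the fixture body is an illustration and not necessarily the code that was merged.

```python
import sys

import pytest


@pytest.fixture
def app_context():
    """Provide a Flask app context to tests that request this fixture."""
    # Deferred import: Superset is only loaded when a test actually uses
    # the fixture, not when pytest merely collects this module.
    from superset.app import create_app

    app = create_app()
    with app.app_context():
        yield


# Defining the fixture does not import Superset; only using it does.
superset_loaded_at_definition = "superset.app" in sys.modules
```

The trade-off is the usual one for deferred imports: collection stays fast, at the cost of the import error (if any) surfacing only when the first test runs.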

@betodealmeida betodealmeida force-pushed the gsheets_validate_parameters branch from 3ec3646 to 178ccf8 Compare July 7, 2021 21:39
@pull-request-size pull-request-size bot added size/XL and removed size/L labels Jul 7, 2021

codecov bot commented Jul 7, 2021

Codecov Report

Merging #15578 (17c7ab5) into master (c732d2d) will decrease coverage by 0.21%.
The diff coverage is 100.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #15578      +/-   ##
==========================================
- Coverage   76.95%   76.73%   -0.22%     
==========================================
  Files         976      976              
  Lines       51293    51318      +25     
  Branches     6907     6907              
==========================================
- Hits        39471    39378      -93     
- Misses      11603    11721     +118     
  Partials      219      219              
Flag       Coverage Δ
hive       ?
mysql      81.56% <100.00%> (+0.01%) ⬆️
postgres   81.58% <100.00%> (+0.01%) ⬆️
presto     ?
python     81.67% <100.00%> (-0.43%) ⬇️
sqlite     81.19% <100.00%> (+0.01%) ⬆️

Flags with carried forward coverage won't be shown.

Impacted Files                        Coverage Δ
superset/db_engine_specs/gsheets.py   75.00% <100.00%> (+16.66%) ⬆️
superset/db_engines/hive.py           0.00% <0.00%> (-82.15%) ⬇️
superset/db_engine_specs/hive.py      69.44% <0.00%> (-17.07%) ⬇️
superset/db_engine_specs/presto.py    83.36% <0.00%> (-6.95%) ⬇️
superset/views/database/mixins.py     81.03% <0.00%> (-1.73%) ⬇️
superset/connectors/sqla/models.py    88.26% <0.00%> (-1.65%) ⬇️
superset/db_engine_specs/base.py      88.14% <0.00%> (-0.40%) ⬇️
superset/models/core.py               89.76% <0.00%> (-0.27%) ⬇️
superset/utils/core.py                88.97% <0.00%> (-0.13%) ⬇️
superset/dashboards/schemas.py        99.37% <0.00%> (+<0.01%) ⬆️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data

Contributor

@ofekisr ofekisr left a comment

Nice to see the first unit test 👍

@@ -33,6 +36,12 @@
SYNTAX_ERROR_REGEX = re.compile('SQLError: near "(?P<server_error>.*?)": syntax error')


class GSheetsParametersType(TypedDict):
Contributor

Be careful: TypedDict was added in Python 3.8, so ... maybe the name should be changed.
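One common way to address this kind of version concern is a guarded import that falls back to the `typing_extensions` backport on older interpreters. The sketch below assumes that backport is installed on Python 3.7; whether the PR took this route is not shown in this thread. The field names are taken from the diff context in this review.

```python
import sys
from typing import Any, Dict

if sys.version_info >= (3, 8):
    from typing import TypedDict
else:
    # Assumption: the typing_extensions backport is available on 3.7.
    from typing_extensions import TypedDict


class GSheetsParametersType(TypedDict):
    credentials_info: Dict[str, Any]
    query: Dict[str, Any]
    catalog: Dict[str, str]


# A TypedDict is a plain dict at runtime; the annotations only help
# static type checkers.
params: GSheetsParametersType = {
    "credentials_info": {},
    "query": {"list_all_sheets": "1"},
    "catalog": {
        "public_sheet": "https://docs.google.com/spreadsheets/d/1/edit#gid=1"
    },
}
```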

Member Author

"""


def test_validate_parameters_simple(mocker, app_context):
Contributor

@ofekisr ofekisr Jul 7, 2021

I recommend putting all the tests under a class so you can organize them more easily: you can skip or mark the whole class, or annotate it with a fixture that will be applied to all the tests in it.

Member Author

It's not clearly documented but when we switched to pytest we decided to stop using classes and adopt a functional approach. There's some discussion here: #12680 (comment)
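As a side note for readers weighing the two styles: pytest also supports module-level marks, which give function-based tests some of the grouping a class would provide. The sketch below uses pytest's `pytestmark` variable; the stand-in `app_context` fixture and the test bodies are placeholders for illustration, not code from this PR.

```python
import pytest


@pytest.fixture
def app_context():
    """Placeholder fixture so this module is self-contained."""
    yield


# A module-level `pytestmark` applies the mark to every test function in
# this file, with no class needed.
pytestmark = pytest.mark.usefixtures("app_context")


def test_first() -> None:
    assert 1 + 1 == 2


def test_second() -> None:
    assert "gsheets".startswith("g")
```

The same variable accepts a list of marks, so a skip or a custom marker can be applied to the whole module in the same way.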

@pull-request-size pull-request-size bot added size/L and removed size/XL labels Jul 7, 2021
@@ -33,6 +36,12 @@
SYNTAX_ERROR_REGEX = re.compile('SQLError: near "(?P<server_error>.*?)": syntax error')


class GSheetsParametersType(TypedDict):
    credentials_info: Dict[str, Any]
    query: Dict[str, Any]
Member

gsheets doesn't need query

Member Author

But it does! The user can definitely pass options to it via the query string.

Member

Is the query string for each individual url in the catalog?

Member Author

No, it's to pass parameters to configure the engine, eg:

gsheets://?list_all_sheets=1

This makes the dropdown show not only sheets in the catalog, but all of the user's sheets.

This can be done in the "extra" section as well, to be clear.
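To make the distinction concrete, here is a small standard-library sketch of how the query part of such a URI maps to a dictionary of engine options. The `engine_options` helper is a hypothetical name, and this is not how Superset or the GSheets driver parses the URI; it only illustrates what the `query` field carries.

```python
from typing import Dict
from urllib.parse import parse_qsl, urlsplit


def engine_options(uri: str) -> Dict[str, str]:
    """Extract engine-level options from the query string of a URI."""
    # The query applies to the connection as a whole, not to any
    # individual sheet URL in the catalog.
    return dict(parse_qsl(urlsplit(uri).query))


options = engine_options("gsheets://?list_all_sheets=1")
```

Note that values come back as strings, so `"1"` would still need to be interpreted as a flag by whatever consumes the options.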

class GSheetsParametersType(TypedDict):
    credentials_info: Dict[str, Any]
    query: Dict[str, Any]
    catalog: Dict[str, str]
Member

Can we name this table_catalog?

Member Author

Sure!

@betodealmeida betodealmeida merged commit 4f5f928 into apache:master Jul 8, 2021
# specific language governing permissions and limitations
# under the License.

import pytest
Member

@betodealmeida we could look at using pytest-flask.

cccs-RyanS pushed a commit to CybercentreCanada/superset that referenced this pull request Dec 17, 2021
* feat: validate_parameters for GSheets

* Move import inside fixture

* Update deps

* Rename parameter
QAlexBall pushed a commit to QAlexBall/superset that referenced this pull request Dec 29, 2021
* feat: validate_parameters for GSheets

* Move import inside fixture

* Update deps

* Rename parameter
cccs-rc pushed a commit to CybercentreCanada/superset that referenced this pull request Mar 6, 2024
* feat: validate_parameters for GSheets

* Move import inside fixture

* Update deps

* Rename parameter
@mistercrunch mistercrunch added 🏷️ bot A label used by `supersetbot` to keep track of which PR where auto-tagged with release labels 🚢 1.3.0 labels Mar 12, 2024
Labels
🏷️ bot A label used by `supersetbot` to keep track of which PR where auto-tagged with release labels size/L 🚢 1.3.0

5 participants