-
Notifications
You must be signed in to change notification settings - Fork 14k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: welcome presto to the suite of tested databases #10498
Changes from 4 commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -92,6 +92,66 @@ jobs: | |
- name: Test babel extraction | ||
run: flask fab babel-extract --target superset/translations --output superset/translations/messages.pot --config superset/translations/babel.cfg -k _,__,t,tn,tct | ||
|
||
test-postgres-presto: | ||
runs-on: ubuntu-18.04 | ||
strategy: | ||
matrix: | ||
# run unit tests in multiple version just for fun | ||
python-version: [3.8] | ||
env: | ||
PYTHONPATH: ${{ github.workspace }} | ||
SUPERSET_CONFIG: tests.superset_test_config | ||
REDIS_PORT: 16379 | ||
SUPERSET__SQLALCHEMY_DATABASE_URI: | ||
postgresql+psycopg2://superset:[email protected]:15432/superset | ||
SUPERSET__SQLALCHEMY_EXAMPLES_URI: | ||
presto://localhost:15433/memory/default | ||
services: | ||
postgres: | ||
image: postgres:10-alpine | ||
env: | ||
POSTGRES_USER: superset | ||
POSTGRES_PASSWORD: superset | ||
ports: | ||
# Use custom ports for services to avoid accidentally connecting to | ||
# GitHub action runner's default installations | ||
- 15432:5432 | ||
presto: | ||
image: prestosql/presto:339 | ||
env: | ||
POSTGRES_USER: superset | ||
POSTGRES_PASSWORD: superset | ||
ports: | ||
# Use custom ports for services to avoid accidentally connecting to | ||
# GitHub action runner's default installations | ||
- 15433:8080 | ||
redis: | ||
image: redis:5-alpine | ||
ports: | ||
- 16379:6379 | ||
steps: | ||
- uses: actions/checkout@v2 | ||
- name: Setup Python | ||
uses: actions/[email protected] | ||
with: | ||
python-version: ${{ matrix.python-version }} | ||
- name: Install dependencies | ||
uses: apache-superset/cached-dependencies@b90713b | ||
with: | ||
run: | | ||
apt-get-install | ||
pip-upgrade | ||
pip install -r requirements/testing.txt | ||
setup-postgres | ||
- name: Run celery | ||
run: celery worker --app=superset.tasks.celery_app:app -Ofair -c 2 & | ||
- name: Python unit tests (PostgreSQL) | ||
run: | | ||
./scripts/python_tests.sh | ||
- name: Upload code coverage | ||
run: | | ||
bash <(curl -s https://codecov.io/bash) -cF python | ||
|
||
test-postgres: | ||
runs-on: ubuntu-18.04 | ||
strategy: | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -54,19 +54,26 @@ def gen_filter( | |
|
||
def load_data(tbl_name: str, database: Database, sample: bool = False) -> None: | ||
pdf = pd.read_json(get_example_data("birth_names.json.gz")) | ||
pdf.ds = pd.to_datetime(pdf.ds, unit="ms") | ||
if database.backend == "presto": | ||
pdf.ds = pd.to_datetime(pdf.ds, unit="ms") | ||
pdf.ds = pdf.ds.dt.strftime("%Y-%m-%d %H:%M%:%S") | ||
else: | ||
pdf.ds = pd.to_datetime(pdf.ds, unit="ms") | ||
pdf = pdf.head(100) if sample else pdf | ||
|
||
pdf.to_sql( | ||
tbl_name, | ||
database.get_sqla_engine(), | ||
if_exists="replace", | ||
chunksize=500, | ||
dtype={ | ||
"ds": DateTime, | ||
# TODO(bkyryliuk): use TIMESTAMP type for presto | ||
"ds": DateTime if database.backend != "presto" else String(255), | ||
Comment on lines
+71
to
+72
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Same here, e.g. |
||
"gender": String(16), | ||
"state": String(10), | ||
"name": String(255), | ||
}, | ||
method="multi", | ||
index=False, | ||
) | ||
print("Done loading table!") | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -1022,6 +1022,13 @@ def get_example_database() -> "Database": | |
return get_or_create_db("examples", db_uri) | ||
|
||
|
||
def get_main_database() -> "Database": | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. We usually refer to this as the "metadata" database - should that be in the function name? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. it is called in superset as a main database on the dev installation, I am fine with renaming it - however we should probably do it across the board in a separate PR There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Ah, forgive the confusion on my part. This is fine as-is then. |
||
from superset import conf | ||
|
||
db_uri = conf.get("SQLALCHEMY_DATABASE_URI") | ||
return get_or_create_db("main", db_uri) | ||
|
||
|
||
def is_adhoc_metric(metric: Metric) -> bool: | ||
return bool( | ||
isinstance(metric, dict) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This logic should ideally be moved out to
db_engine_specs/presto.py
to keep this clean of db-specific logic. Something along the lines ofBaseEngineSpec.convert_examples_datetime()
or similar (there are other similar methods there).There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
my plan here is a bit different, we have talked about getting rid of load_examples for the test, I plan to move initialization code into the pytest fixture and will do that cleanup & restructure code to be more database generic.
The scope of the changes is test only, I prefer not to modify production code.