Replace SQLAlchemy Migrate with Alembic #13108

Closed
wants to merge 61 commits
Commits (61)
348124d
Add alembic to dependencies
jdavcs Dec 30, 2021
52a5622
Add testing infrastructure
jdavcs Dec 30, 2021
1324e21
Add alembic infrastructure
jdavcs Dec 30, 2021
80741b5
Add alembic dir to ignore list in run_tests
jdavcs Nov 9, 2021
9357c61
Add alembic dir to ignore list in package tests
jdavcs Nov 29, 2021
c0bea73
Move engine_options default arg initialization to engine_factory
jdavcs Dec 1, 2021
dd49a3e
Remove migrations test from first startup workflow
jdavcs Dec 2, 2021
bc20e0b
Add workflow for migration tests
jdavcs Dec 4, 2021
64e81ef
Run tests under Python 3.7
jdavcs Dec 21, 2021
0a1c60e
Move triggers out of migrate directory
jdavcs Dec 31, 2021
058f09a
Integrate migrations design into config and model code
jdavcs Dec 31, 2021
69eed35
Handle error when revision not found; add test case
jdavcs Jan 4, 2022
1295ac9
Avoid extra trip to db via lazy load of alembic version heads
jdavcs Jan 4, 2022
063d88b
Use correct db options for both engines; resolve todos
jdavcs Jan 4, 2022
27dde54
Fix mypy error
jdavcs Jan 5, 2022
46f8504
Add informative log/error messages; add tests for new methods
jdavcs Jan 5, 2022
80b2e1f
Drop config.database_create_tables (not used)
jdavcs Jan 5, 2022
9c4c279
Drop app.new_installation (not used)
jdavcs Jan 5, 2022
4272df1
Don't use triple-quoted strings as comments in tests
jdavcs Jan 5, 2022
d416517
Setup infrastructure for running db scripts
jdavcs Jan 6, 2022
dae4797
Add create/migrate scripts for tool shed
jdavcs Jan 6, 2022
a184f95
Add create/migrate scripts for Alembic dbs
jdavcs Jan 6, 2022
cb2f04b
Fix integration test (test only relevant for TS)
jdavcs Jan 6, 2022
beafd17
Fix mypy error
jdavcs Jan 6, 2022
d777fc1
When overriding, override.
jdavcs Jan 6, 2022
1808ccb
Update data package (requirements, packages, includes)
jdavcs Jan 6, 2022
9922111
Move helper function out of migrate into triggers
jdavcs Jan 6, 2022
b13c402
Remove galaxy.model.migrate files
jdavcs Jan 6, 2022
eebd04b
Remove galaxy.model.tool_shed_install.migrate files
jdavcs Jan 6, 2022
6c8575b
Move is-one-db check into model.database_utils
jdavcs Jan 6, 2022
6b4e393
Update data package entry point in orm.scripts.py
jdavcs Jan 6, 2022
a325964
Remove outdated parts from orm.scripts.py
jdavcs Jan 6, 2022
acdd0fa
Update db index testing: test both models, gxy and tsi
jdavcs Jan 6, 2022
a61e246
Add a comment on sqlalchemy-migrate to pyproject.toml
jdavcs Jan 6, 2022
e0593e9
Address existing mypy error (see note)
jdavcs Jan 6, 2022
2f30bbb
Dispose of engines when invoked via script
jdavcs Jan 6, 2022
cb59e6b
Add type hints to migrations modules
jdavcs Jan 9, 2022
7720228
Type-ignore alembic type hint bug
jdavcs Jan 9, 2022
2f2f1d7
Add type hints to common module in migration tests
jdavcs Jan 9, 2022
85e9a0d
Remove circular dependency between migrations and config
jdavcs Jan 10, 2022
329875a
Add migrations README
jdavcs Jan 11, 2022
56c379e
Add prev. version 180 as new revision.
jdavcs Jan 19, 2022
56cdf53
Preserve manage_db.sh, but do not hide alembic
jdavcs Jan 20, 2022
12523c9
Update README: rename script, add legacy script, simplify upgrade steps.
jdavcs Jan 20, 2022
5e5ba15
Modify behavior of legacy wrapper, add tests, refactor
jdavcs Jan 21, 2022
c6c2e11
Check for table existence for vault (see note)
jdavcs Jan 21, 2022
4c80330
Stub out config call in test_scripts; fixes package test
jdavcs Jan 21, 2022
11657d1
Update manage_db.sh ($@ > "$@")
jdavcs Jan 25, 2022
1dcce64
Update manage_db.sh ($@ > "$@")
jdavcs Jan 25, 2022
394e275
Update manage_db.sh (use single `=` in `[ ]`)
jdavcs Jan 25, 2022
8a26c3c
Fix import order in config
jdavcs Jan 31, 2022
ebe36de
Apply black, isort
jdavcs Feb 6, 2022
5c413a2
Remove old migrate files (fix rebase)
jdavcs Feb 6, 2022
ea09825
Change latest db version to 180
jdavcs Feb 9, 2022
020eef6
Adapt manage_db.py for usegalaxy playbook; add tests
jdavcs Feb 10, 2022
5e1a61c
Fix some of the package tests
jdavcs Feb 10, 2022
61411f8
Fix path for package tests
jdavcs Feb 10, 2022
3991eaf
Print result of script execution
jdavcs Feb 10, 2022
4fdec1f
One engine for 2 dbs logically equiv to 2 refs to same engine
jdavcs Feb 18, 2022
6f55002
Change handling of unrecognized revisions; improve log messages
jdavcs Feb 19, 2022
ec6d9a9
Remove vault table revision (part of 22.01)
jdavcs Feb 19, 2022
3 changes: 0 additions & 3 deletions .github/workflows/first_startup.yaml
@@ -57,9 +57,6 @@ jobs:
yarn-lock-file: 'galaxy root/client/yarn.lock'
- name: Install tox
run: pip install tox
# Use this job to test the latest migrations
- run: wget -q https://github.com/jmchilton/galaxy-downloads/raw/master/db_gx_rev_0141.sqlite
- run: mv db_gx_rev_0141.sqlite 'galaxy root'/database/universe.sqlite
- name: run tests
run: tox -e first_startup
working-directory: 'galaxy root'
56 changes: 56 additions & 0 deletions .github/workflows/unit-postgres.yaml
@@ -0,0 +1,56 @@
name: Unit w/postgres tests
on:
  push:
    paths-ignore:
      - 'client/**'
      - 'doc/**'
  pull_request:
    paths-ignore:
      - 'client/**'
      - 'doc/**'
env:
  GALAXY_TEST_DBURI: 'postgresql://postgres:postgres@localhost:5432/postgres?client_encoding=utf8' # using postgres as the db
concurrency:
  group: py-unit-${{ github.ref }}
  cancel-in-progress: true
jobs:
  test:
    name: Test
    runs-on: ubuntu-latest
    strategy:
      fail-fast: false
      matrix:
        python-version: ['3.7']
    services:
      postgres:
        image: postgres:13
        env:
          POSTGRES_USER: postgres
          POSTGRES_PASSWORD: postgres
          POSTGRES_DB: postgres
        ports:
          - 5432:5432
    steps:
      - uses: actions/checkout@v2
        with:
          path: 'galaxy root'
      - uses: actions/setup-python@v2
        with:
          python-version: ${{ matrix.python-version }}
      - name: Get full Python version
        id: full-python-version
        shell: bash
        run: echo ::set-output name=version::$(python -c "import sys; print('-'.join(str(v) for v in sys.version_info))")
      - name: Cache pip dir
        uses: actions/cache@v2
        with:
          path: ~/.cache/pip
          key: pip-cache-${{ matrix.python-version }}-${{ hashFiles('galaxy root/requirements.txt') }}
      - name: Cache galaxy venv
        uses: actions/cache@v2
        with:
          path: .venv
          key: gxy-venv-${{ runner.os }}-${{ steps.full-python-version.outputs.version }}-${{ hashFiles('galaxy root/requirements.txt') }}-unitdb
      - name: Run tests
        run: ./run_tests.sh -unit test/unit/data/model/migrations/test_migrations.py
        working-directory: 'galaxy root'
16 changes: 16 additions & 0 deletions create_db.sh
@@ -1,5 +1,21 @@
#!/bin/sh

#######
# Use this script to verify the state of the Galaxy and Tool Shed Install
# database(s). If the database does not exist or is empty, it will be created
# and initialized.
# (Use create_toolshed_db.sh to create and initialize a new
# Tool Shed database.)
#
# To pass a galaxy config file, use `--galaxy-config`
#
# You may also override the galaxy database url and/or the
# tool shed install database url, as well as the database_template
# and database_encoding configuration options with env vars:
# GALAXY_CONFIG_OVERRIDE_DATABASE_CONNECTION=my-db-url ./create_db.sh
# GALAXY_INSTALL_CONFIG_OVERRIDE_DATABASE_CONNECTION=my-other-db-url ./create_db.sh
#######

cd "$(dirname "$0")"

. ./scripts/common_startup_functions.sh
16 changes: 16 additions & 0 deletions create_toolshed_db.sh
@@ -0,0 +1,16 @@
#!/bin/sh

#######
# Use this script to verify the state of the Tool Shed database.
# If the database does not exist or is empty, it will be created
# and initialized.
# (For Galaxy and Tool Shed Install databases, use create_db.sh).
#######

cd "$(dirname "$0")"

. ./scripts/common_startup_functions.sh

setup_python

python ./scripts/create_toolshed_db.py "$@" tool_shed
1 change: 0 additions & 1 deletion lib/galaxy/app.py
@@ -171,7 +171,6 @@ def __init__(self, fsmon=False, configure_logging=True, **kwargs) -> None:
log.debug("python path is: %s", ", ".join(sys.path))
self.name = "galaxy"
self.is_webapp = False
self.new_installation = False
# Read config file and check for errors
self.config: Any = self._register_singleton(config.Configuration, config.Configuration(**kwargs))
self.config.check()
98 changes: 59 additions & 39 deletions lib/galaxy/config/__init__.py
@@ -41,8 +41,11 @@
from galaxy.containers import parse_containers_config
from galaxy.exceptions import ConfigurationError
from galaxy.model import mapping
from galaxy.model.database_utils import database_exists
from galaxy.model.tool_shed_install.migrate.check import create_or_verify_database as tsi_create_or_verify_database
from galaxy.model.database_utils import (
database_exists,
is_one_database,
)
from galaxy.model.orm.engine_factory import build_engine
from galaxy.schema.fields import BaseDatabaseIdField
from galaxy.structured_app import BasicSharedApp
from galaxy.util import (
@@ -724,7 +727,6 @@ def _process_config(self, kwargs):
db_path = self._in_data_dir("universe.sqlite")
self.database_connection = f"sqlite:///{db_path}?isolation_level=IMMEDIATE"
self.database_engine_options = get_database_engine_options(kwargs)
self.database_create_tables = string_as_bool(kwargs.get("database_create_tables", "True"))
self.database_encoding = kwargs.get("database_encoding") # Create new databases with this encoding
self.thread_local_log = None
if self.enable_per_request_sql_debugging:
@@ -1476,58 +1478,76 @@ def _configure_tool_shed_registry(self):
else:
self.tool_shed_registry = galaxy.tool_shed.tool_shed_registry.Registry()

def _configure_engines(self, db_url, install_db_url, combined_install_database):
trace_logger = getattr(self, "trace_logger", None)
engine = build_engine(
db_url,
self.config.database_engine_options,
self.config.database_query_profiling_proxy,
trace_logger,
self.config.slow_query_log_threshold,
self.config.thread_local_log,
self.config.database_log_query_counts,
)
install_engine = None
if not combined_install_database:
install_engine = build_engine(install_db_url, self.config.install_database_engine_options)
return engine, install_engine

def _configure_models(self, check_migrate_databases=False, config_file=None):
"""Preconditions: object_store must be set on self."""
# TODO this block doesn't seem to belong in this method
if getattr(self.config, "max_metadata_value_size", None):
from galaxy.model import custom_types

custom_types.MAX_METADATA_VALUE_SIZE = self.config.max_metadata_value_size

db_url = get_database_url(self.config)
install_db_url = self.config.install_database_connection
# TODO: Consider more aggressive check here that this is not the same
# database file under the hood.
combined_install_database = not (install_db_url and install_db_url != db_url)
install_db_url = install_db_url or db_url
install_database_options = (
self.config.database_engine_options
if combined_install_database
else self.config.install_database_engine_options
)
combined_install_database = is_one_database(db_url, install_db_url)
engine, install_engine = self._configure_engines(db_url, install_db_url, combined_install_database)

if self.config.database_wait:
self._wait_for_database(db_url)

if getattr(self.config, "max_metadata_value_size", None):
from galaxy.model import custom_types

custom_types.MAX_METADATA_VALUE_SIZE = self.config.max_metadata_value_size

if check_migrate_databases:
# Initialize database / check for appropriate schema version. # If this
# is a new installation, we'll restrict the tool migration messaging.
from galaxy.model.migrate.check import create_or_verify_database

create_or_verify_database(
db_url,
config_file,
self.config.database_engine_options,
app=self,
map_install_models=combined_install_database,
)
if not combined_install_database:
tsi_create_or_verify_database(install_db_url, install_database_options, app=self)

self.model = init_models_from_config(
self.config,
map_install_models=combined_install_database,
object_store=self.object_store,
trace_logger=getattr(self, "trace_logger", None),
self._verify_databases(engine, install_engine, combined_install_database)

self.model = mapping.configure_model_mapping(
self.config.file_path,
self.object_store,
self.config.use_pbkdf2,
engine,
combined_install_database,
self.config.thread_local_log,
)

if combined_install_database:
log.info("Install database targetting Galaxy's database configuration.")
log.info("Install database targeting Galaxy's database configuration.") # TODO this message is ambiguous
self.install_model = self.model
else:
from galaxy.model.tool_shed_install import mapping as install_mapping

install_db_url = self.config.install_database_connection
self.install_model = install_mapping.configure_model_mapping(install_engine)
log.info(f"Install database using its own connection {install_db_url}")
self.install_model = install_mapping.init(install_db_url, install_database_options)

def _verify_databases(self, engine, install_engine, combined_install_database):
from galaxy.model.migrations import verify_databases

install_template, install_encoding = None, None
if not combined_install_database: # Otherwise these options are not used.
install_template = getattr(self.config, "install_database_template", None)
install_encoding = getattr(self.config, "install_database_encoding", None)

verify_databases(
engine,
self.config.database_template,
self.config.database_encoding,
install_engine,
install_template,
install_encoding,
self.config.database_auto_migrate,
)

def _configure_signal_handlers(self, handlers):
for sig, handler in handlers.items():
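Side note, not part of the changeset: a minimal sketch of the new startup check for the common single-database case, following the _verify_databases method above. The imports and argument order come from this diff; the URL and the None/False argument values are illustrative assumptions, not Galaxy's defaults.

from galaxy.model.migrations import verify_databases
from galaxy.model.orm.engine_factory import build_engine

# One combined Galaxy + Tool Shed Install database: there is no separate
# install engine, so the install-specific arguments simply stay None.
engine = build_engine("sqlite:///./database/universe.sqlite", {})
verify_databases(
    engine,
    None,   # database_template
    None,   # database_encoding
    None,   # install_engine
    None,   # install_template
    None,   # install_encoding
    False,  # database_auto_migrate
)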
1 change: 1 addition & 0 deletions lib/galaxy/dependencies/dev-requirements.txt
@@ -4,6 +4,7 @@ a2wsgi==1.4.0; python_version >= "3.6" and python_version < "4.0"
adal==1.2.7
aiofiles==0.8.0; python_version >= "3.6" and python_version < "4.0"
alabaster==0.7.12; python_version >= "3.6" and python_full_version < "3.0.0" or python_full_version >= "3.4.0" and python_version >= "3.6"
alembic==1.7.4; python_version >= "3.6"
amqp==5.0.9; python_version >= "3.7"
anyio==3.5.0; python_version >= "3.7" and python_full_version >= "3.6.2"
appdirs==1.4.4
1 change: 1 addition & 0 deletions lib/galaxy/dependencies/pinned-requirements.txt
@@ -3,6 +3,7 @@
a2wsgi==1.4.0; python_version >= "3.6" and python_version < "4.0"
adal==1.2.7
aiofiles==0.8.0; python_version >= "3.6" and python_version < "4.0"
alembic==1.7.4; python_version >= "3.6"
amqp==5.0.9; python_version >= "3.7"
anyio==3.5.0; python_version >= "3.7" and python_full_version >= "3.6.2"
appdirs==1.4.4
12 changes: 12 additions & 0 deletions lib/galaxy/model/database_utils.py
@@ -1,5 +1,6 @@
import sqlite3
from contextlib import contextmanager
from typing import Optional

from sqlalchemy import create_engine
from sqlalchemy.engine.url import make_url
@@ -118,3 +119,14 @@ def create(self, encoding, *arg):
        stmt = f"CREATE DATABASE {database} CHARACTER SET = '{encoding}'"
        with engine.connect().execution_options(isolation_level="AUTOCOMMIT") as conn:
            conn.execute(stmt)


def is_one_database(db1_url: str, db2_url: Optional[str]):
    """
    Check if the arguments refer to one database. This will be true
    if only one argument is passed, or if the urls are the same.
    URLs are strings, so sameness is determined via string comparison.
    """
    # TODO: Consider more aggressive check here that this is not the same
    # database file under the hood.
    return not (db1_url and db2_url and db1_url != db2_url)
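Side note, not part of the changeset: a quick sketch of how is_one_database is expected to behave, based solely on the implementation above. The URLs are made up for illustration.

from galaxy.model.database_utils import is_one_database

gxy_url = "postgresql://galaxy@localhost/galaxy"
tsi_url = "postgresql://galaxy@localhost/galaxy_install"

assert is_one_database(gxy_url, gxy_url)      # identical URLs -> one database
assert is_one_database(gxy_url, None)         # no install URL -> one database
assert not is_one_database(gxy_url, tsi_url)  # distinct URLs -> two databases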
76 changes: 43 additions & 33 deletions lib/galaxy/model/mapping.py
@@ -1,9 +1,3 @@
"""
This module no longer contains the mapping of data model classes to the
relational database.
The module will be revised during migration from SQLAlchemy Migrate to Alembic.
"""

import logging
from threading import local
from typing import (
@@ -14,9 +8,9 @@
from galaxy import model
from galaxy.model import mapper_registry
from galaxy.model.base import SharedModelMapping
from galaxy.model.migrate.triggers.update_audit_table import install as install_timestamp_triggers
from galaxy.model.orm.engine_factory import build_engine
from galaxy.model.security import GalaxyRBACAgent
from galaxy.model.triggers.update_audit_table import install as install_timestamp_triggers
from galaxy.model.view.utils import install_views

log = logging.getLogger(__name__)
@@ -27,7 +21,6 @@
class GalaxyModelMapping(SharedModelMapping):
security_agent: GalaxyRBACAgent
thread_local_log: Optional[local]
create_tables: bool
User: Type
GalaxySession: Type

@@ -46,16 +39,7 @@ def init(
thread_local_log: Optional[local] = None,
log_query_counts=False,
) -> GalaxyModelMapping:
"""Connect mappings to the database"""
if engine_options is None:
engine_options = {}
# Connect dataset to the file path
model.Dataset.file_path = file_path
# Connect dataset to object store
model.Dataset.object_store = object_store
# Use PBKDF2 password hashing?
model.User.use_pbkdf2 = use_pbkdf2
# Load the appropriate db module
# Build engine
engine = build_engine(
url,
engine_options,
@@ -66,24 +50,50 @@
log_query_counts=log_query_counts,
)

# Create tables if needed
if create_tables:
mapper_registry.metadata.create_all(bind=engine)
create_additional_database_objects(engine)
if map_install_models:
from galaxy.model.tool_shed_install import mapping as install_mapping # noqa: F401

install_mapping.create_database_objects(engine)

# Configure model, build ModelMapping
return configure_model_mapping(file_path, object_store, use_pbkdf2, engine, map_install_models, thread_local_log)


def create_additional_database_objects(engine):
install_timestamp_triggers(engine)
install_views(engine)


def configure_model_mapping(
file_path,
object_store,
use_pbkdf2,
engine,
map_install_models,
thread_local_log,
):
_configure_model(file_path, object_store, use_pbkdf2)
return _build_model_mapping(engine, map_install_models, thread_local_log)


def _configure_model(file_path, object_store, use_pbkdf2):
model.Dataset.file_path = file_path
model.Dataset.object_store = object_store
model.User.use_pbkdf2 = use_pbkdf2


def _build_model_mapping(engine, map_install_models, thread_local_log):
model_modules = [model]
if map_install_models:
import galaxy.model.tool_shed_install.mapping # noqa: F401
from galaxy.model import tool_shed_install

galaxy.model.tool_shed_install.mapping.init(url=url, engine_options=engine_options, create_tables=create_tables)
model_modules.append(tool_shed_install)

result = GalaxyModelMapping(model_modules, engine=engine)

# Create tables if needed
if create_tables:
metadata.create_all(bind=engine)
install_timestamp_triggers(engine)
install_views(engine)

result.create_tables = create_tables
# load local galaxy security policy
result.security_agent = GalaxyRBACAgent(result)
result.thread_local_log = thread_local_log
return result
model_mapping = GalaxyModelMapping(model_modules, engine=engine)
model_mapping.security_agent = GalaxyRBACAgent(model_mapping)
model_mapping.thread_local_log = thread_local_log
return model_mapping
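Side note, not part of the changeset: a rough sketch of how the refactored entry points fit together (build an engine, then configure the model mapping), mirroring the calls shown above in lib/galaxy/config/__init__.py and lib/galaxy/model/mapping.py. The parameter values are placeholder assumptions, not Galaxy's defaults.

from galaxy.model import mapping
from galaxy.model.orm.engine_factory import build_engine

engine = build_engine("sqlite:///./database/universe.sqlite", {})

# configure_model_mapping() replaces the table-creating parts of the old init():
# it only wires an already-verified database to the model layer.
model = mapping.configure_model_mapping(
    "database/files",   # file_path (placeholder)
    None,               # object_store (a real ObjectStore instance in Galaxy)
    True,               # use_pbkdf2
    engine,
    True,               # map_install_models: combined install database
    None,               # thread_local_log
)
security_agent = model.security_agent  # set by _build_model_mapping above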