-
Notifications
You must be signed in to change notification settings - Fork 4
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Delete table database versioning #2465
Conversation
mitchdawson1982
commented
Nov 21, 2023
•
edited
Loading
edited
- Adds support for deleting versioned databases
- No longer returns if schema.json is not found
- No longer returns if glue table not found
Signed-off-by: Jacob Woffenden <[email protected]>
* Adding kyverno configuration in Dev Co-authored-by: Jacob Woffenden <[email protected]> * Adding lint ignore for Kyverno security policies * Add missing newline --------- Co-authored-by: Jacob Woffenden <[email protected]>
Signed-off-by: Jacob Woffenden <[email protected]> Co-authored-by: Gary H <[email protected]>
Bumps [bridgecrewio/checkov-action](https://github.com/bridgecrewio/checkov-action) from 12.2580.0 to 12.2585.0. - [Release notes](https://github.com/bridgecrewio/checkov-action/releases) - [Commits](bridgecrewio/checkov-action@558f721...ef1b835) --- updated-dependencies: - dependency-name: bridgecrewio/checkov-action dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <[email protected]>
Bumps [hashicorp/aws](https://github.com/hashicorp/terraform-provider-aws) from 5.25.0 to 5.26.0. - [Release notes](https://github.com/hashicorp/terraform-provider-aws/releases) - [Changelog](https://github.com/hashicorp/terraform-provider-aws/blob/main/CHANGELOG.md) - [Commits](hashicorp/terraform-provider-aws@v5.25.0...v5.26.0) --- updated-dependencies: - dependency-name: hashicorp/aws dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <[email protected]>
Signed-off-by: Jacob Woffenden <[email protected]>
…-slim to 1.25.3-alpine3.18-slim in /containers/static-assets (#2446) * docker(deps): bump nginxinc/nginx-unprivileged Bumps [nginxinc/nginx-unprivileged](https://github.com/nginxinc/docker-nginx-unprivileged) from 1.25.2-alpine3.18-slim to 1.25.3-alpine3.18-slim. - [Release notes](https://github.com/nginxinc/docker-nginx-unprivileged/releases) - [Commits](https://github.com/nginxinc/docker-nginx-unprivileged/commits) --- updated-dependencies: - dependency-name: nginxinc/nginx-unprivileged dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <[email protected]> * 🔊 Update CHANGELOG * 📝 Line length --------- Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Gary H <[email protected]>
Bumps [bridgecrewio/checkov-action](https://github.com/bridgecrewio/checkov-action) from 12.2585.0 to 12.2586.0. - [Release notes](https://github.com/bridgecrewio/checkov-action/releases) - [Commits](bridgecrewio/checkov-action@ef1b835...93e6fa1) --- updated-dependencies: - dependency-name: bridgecrewio/checkov-action dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <[email protected]>
* Move create_glue_database function into glue_utils * Move CuratedDataQueryBuilder into base image * Move CuratedDataLoader into python base * Bump version * move more potentially shared methods out of loader class into utils * update tests * remove commented out code * lint * did not mock athena * one more lint * the impports * updated changelog --------- Co-authored-by: Matt <[email protected]> Co-authored-by: LavMatt <[email protected]>
Bumps [slack-sdk](https://github.com/slackapi/python-slack-sdk) from 3.24.0 to 3.25.0. - [Release notes](https://github.com/slackapi/python-slack-sdk/releases) - [Changelog](https://github.com/slackapi/python-slack-sdk/blob/main/docs-v2/changelog.html) - [Commits](slackapi/python-slack-sdk@v3.24.0...v3.25.0) --- updated-dependencies: - dependency-name: slack-sdk dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <[email protected]>
Signed-off-by: Jacob Woffenden <[email protected]>
Signed-off-by: Jacob Woffenden <[email protected]>
6e86ba2
to
fa3d82f
Compare
def test_invalid_schemas(self, data_product_name): | ||
version_manager = VersionManager(data_product_name, logging.getLogger()) | ||
schema_list = ["schema3", "schema4"] | ||
|
||
with pytest.raises(InvalidUpdate) as exc: | ||
version_manager.update_metadata_remove_schemas(schema_list=schema_list) | ||
assert ( | ||
str(exc.value) | ||
== "Invalid schemas found in schema_list: ['schema3', 'schema4']" | ||
) | ||
|
||
def test_glue_table_not_found( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a replacement for this test? (trying to remove invalid schemas)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The code used to be strict on validating if a schema existed in metadata schemas and would return, however it was pointed out that the could be scenarios where data files or tables were present, but a schema was not so this was not an optimal approach. The code will log schemas that are not present in the metadata but will attempt to remove tables and files for any provided schema value. Hence there is no longer a need to validate the supplied schema.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I get it now - there was no way to delete an 'orphaned' table for a Data Product through the API.
I think it may be preferable to at least return the failure message to the user '{schema_name} schema metadata not found, {schema_name} schema metadata not deleted'
, ideally together with a message of the glue table being deleted to give the full context of what's happened (as part of the response message):
# schema not found
f"{schema} schema metadata not found, {schema} schema metadata not deleted"
f"{data_product}.{schema} table was successfully deleted, and can no longer be queried"
# schema found
f"{schema} schema metadata found, {schema} schema metadata successfully deleted"
f"{data_product}.{schema} table was successfully deleted, and can no longer be queried"
* ♻️ Add database versioning support * ✅ Updates Tests * 📝 Update changelog and version files * 📝 Update documentation to remove references to test fixtures * ♻️ Utilise delete database function from glue & athena utils * ✨ Created delete database function * ✅ Refactor to utilise glue & athena utils create datebase function * 📝 Update python base change log and config version * ✏️ Fix type on data product changelog * ✏️ Fix import typo * 🎨 Update delete_glue_table function name and imports * 🚨 Fix linter warnings
I've merged the changes to glue_and_athena utils and versioning modules in this PR: #2496 We now create new versions of the athena db when something is removed, but we still need to handle the data copying: ministryofjustice/data-catalogue#65 |