Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Delete table database versioning #2465

Closed
wants to merge 32 commits into from

Conversation

mitchdawson1982
Copy link
Contributor

@mitchdawson1982 mitchdawson1982 commented Nov 21, 2023

  • Adds support for deleting versioned databases
  • No longer returns if schema.json is not found
  • No longer returns if glue table not found

@mitchdawson1982 mitchdawson1982 requested a review from a team November 21, 2023 15:03
Jacob Woffenden and others added 15 commits November 21, 2023 15:11
Signed-off-by: Jacob Woffenden <[email protected]>
* Adding kyverno configuration in Dev

Co-authored-by: Jacob Woffenden <[email protected]>

* Adding lint ignore for Kyverno security policies

* Add missing newline

---------

Co-authored-by: Jacob Woffenden <[email protected]>
Signed-off-by: Jacob Woffenden <[email protected]>
Co-authored-by: Gary H <[email protected]>
Bumps [bridgecrewio/checkov-action](https://github.com/bridgecrewio/checkov-action) from 12.2580.0 to 12.2585.0.
- [Release notes](https://github.com/bridgecrewio/checkov-action/releases)
- [Commits](bridgecrewio/checkov-action@558f721...ef1b835)

---
updated-dependencies:
- dependency-name: bridgecrewio/checkov-action
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <[email protected]>
Bumps [hashicorp/aws](https://github.com/hashicorp/terraform-provider-aws) from 5.25.0 to 5.26.0.
- [Release notes](https://github.com/hashicorp/terraform-provider-aws/releases)
- [Changelog](https://github.com/hashicorp/terraform-provider-aws/blob/main/CHANGELOG.md)
- [Commits](hashicorp/terraform-provider-aws@v5.25.0...v5.26.0)

---
updated-dependencies:
- dependency-name: hashicorp/aws
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <[email protected]>
Signed-off-by: Jacob Woffenden <[email protected]>
…-slim to 1.25.3-alpine3.18-slim in /containers/static-assets (#2446)

* docker(deps): bump nginxinc/nginx-unprivileged

Bumps [nginxinc/nginx-unprivileged](https://github.com/nginxinc/docker-nginx-unprivileged) from 1.25.2-alpine3.18-slim to 1.25.3-alpine3.18-slim.
- [Release notes](https://github.com/nginxinc/docker-nginx-unprivileged/releases)
- [Commits](https://github.com/nginxinc/docker-nginx-unprivileged/commits)

---
updated-dependencies:
- dependency-name: nginxinc/nginx-unprivileged
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <[email protected]>

* 🔊 Update CHANGELOG

* 📝 Line length

---------

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Gary H <[email protected]>
Bumps [bridgecrewio/checkov-action](https://github.com/bridgecrewio/checkov-action) from 12.2585.0 to 12.2586.0.
- [Release notes](https://github.com/bridgecrewio/checkov-action/releases)
- [Commits](bridgecrewio/checkov-action@ef1b835...93e6fa1)

---
updated-dependencies:
- dependency-name: bridgecrewio/checkov-action
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <[email protected]>
* Move create_glue_database function into glue_utils

* Move CuratedDataQueryBuilder into base image

* Move CuratedDataLoader into python base

* Bump version

* move more potentially shared methods out of loader class into utils

* update tests

* remove commented out code

* lint

* did not mock athena

* one more lint

* the impports

* updated changelog

---------

Co-authored-by: Matt <[email protected]>
Co-authored-by: LavMatt <[email protected]>
Bumps [slack-sdk](https://github.com/slackapi/python-slack-sdk) from 3.24.0 to 3.25.0.
- [Release notes](https://github.com/slackapi/python-slack-sdk/releases)
- [Changelog](https://github.com/slackapi/python-slack-sdk/blob/main/docs-v2/changelog.html)
- [Commits](slackapi/python-slack-sdk@v3.24.0...v3.25.0)

---
updated-dependencies:
- dependency-name: slack-sdk
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <[email protected]>
@mitchdawson1982 mitchdawson1982 requested a review from a team November 21, 2023 15:19
@github-actions github-actions bot added github-workflow Pull requests that update workflows devcontainer labels Nov 21, 2023
@github-actions github-actions bot removed github-workflow Pull requests that update workflows devcontainer labels Nov 21, 2023
@mitchdawson1982 mitchdawson1982 force-pushed the delete-table-database-versioning branch from 6e86ba2 to fa3d82f Compare November 21, 2023 17:09
Comment on lines -444 to -455
def test_invalid_schemas(self, data_product_name):
version_manager = VersionManager(data_product_name, logging.getLogger())
schema_list = ["schema3", "schema4"]

with pytest.raises(InvalidUpdate) as exc:
version_manager.update_metadata_remove_schemas(schema_list=schema_list)
assert (
str(exc.value)
== "Invalid schemas found in schema_list: ['schema3', 'schema4']"
)

def test_glue_table_not_found(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a replacement for this test? (trying to remove invalid schemas)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code used to be strict on validating if a schema existed in metadata schemas and would return, however it was pointed out that the could be scenarios where data files or tables were present, but a schema was not so this was not an optimal approach. The code will log schemas that are not present in the metadata but will attempt to remove tables and files for any provided schema value. Hence there is no longer a need to validate the supplied schema.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I get it now - there was no way to delete an 'orphaned' table for a Data Product through the API.

I think it may be preferable to at least return the failure message to the user '{schema_name} schema metadata not found, {schema_name} schema metadata not deleted', ideally together with a message of the glue table being deleted to give the full context of what's happened (as part of the response message):

# schema not found
f"{schema} schema metadata not found, {schema} schema metadata not deleted"
f"{data_product}.{schema} table was successfully deleted, and can no longer be queried"

# schema found
f"{schema} schema metadata found, {schema} schema metadata successfully deleted"
f"{data_product}.{schema} table was successfully deleted, and can no longer be queried"

mitchdawson1982 and others added 7 commits November 22, 2023 10:37
* ♻️ Add database versioning support

* ✅ Updates Tests

* 📝 Update changelog  and version files

* 📝 Update documentation to remove references to test fixtures

* ♻️ Utilise delete database function from glue & athena utils

* ✨ Created delete database function

* ✅ Refactor to utilise glue & athena utils create datebase function

* 📝 Update python base change log and config version

* ✏️ Fix type on data product changelog

* ✏️ Fix import typo

* 🎨 Update delete_glue_table function name and imports

* 🚨 Fix linter warnings
@MatMoore
Copy link
Contributor

I've merged the changes to glue_and_athena utils and versioning modules in this PR: #2496

We now create new versions of the athena db when something is removed, but we still need to handle the data copying: ministryofjustice/data-catalogue#65

@MatMoore MatMoore closed this Nov 28, 2023
@mitchdawson1982 mitchdawson1982 deleted the delete-table-database-versioning branch April 18, 2024 10:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants