Enable stateful ingestion in create_cadet_databases_source #383

Open
LavMatt opened this issue Nov 27, 2024 · 1 comment
Assignees
Labels
DataHub Issues relating to DataHub - https://datahubproject.io/ metadata quality python Pull requests that update Python code

Comments

@LavMatt
Contributor

LavMatt commented Nov 27, 2024

Enabling this is required for any ingestion to remove stale metadata, i.e. metadata that was there yesterday but is gone today. As it stands, stale metadata persists in DataHub and hence in find-moj-data.

This was tested in the justice-data source and can be replicated for other sources.

See PR #381 and the DataHub guidance: https://datahubproject.io/docs/metadata-ingestion/docs/dev_guides/add_stateful_ingestion_to_source/

This is the create_cadet_databases ingestion source: https://github.com/ministryofjustice/data-catalogue/tree/main/ingestion/create_cadet_databases_source

Existing data needs to be hard deleted from DataHub and then re-ingested with stateful ingestion enabled for the change to take effect properly.
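
For anyone picking this up, here is a rough sketch of the idea behind stale-metadata removal. It is not DataHub's API and not the code from the linked PR: the `Checkpoint` class, the `find_stale_urns` function and the example URNs are all made up for illustration, using only the standard library. It assumes each run records the set of URNs it emitted and that the next run compares against that record.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class Checkpoint:
    """Hypothetical stand-in for the state saved by one ingestion run."""

    urns: Set[str] = field(default_factory=set)


def find_stale_urns(previous: Optional[Checkpoint], current: Checkpoint) -> Set[str]:
    """Return URNs emitted by the previous run but missing from the current one."""
    if previous is None:
        # No earlier checkpoint to compare against, so nothing can be flagged.
        return set()
    return previous.urns - current.urns


# Illustrative URNs only; these are not real entities in the catalogue.
yesterday = Checkpoint({"urn:li:container:db_one", "urn:li:container:db_two"})
today = Checkpoint({"urn:li:container:db_one"})

stale = find_stale_urns(yesterday, today)  # candidates for soft deletion
```

Under this reading, a run with no earlier checkpoint flags nothing, which may be why entities ingested before the feature was enabled need the hard delete and re-ingest described above.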

@github-project-automation github-project-automation bot moved this to Todo 📝 in Data Catalogue Nov 27, 2024
@mitchdawson1982 mitchdawson1982 self-assigned this Dec 6, 2024
@mitchdawson1982 mitchdawson1982 moved this from Todo 📝 to In Progress 🚀 in Data Catalogue Dec 9, 2024
@mitchdawson1982 mitchdawson1982 moved this from In Progress 🚀 to Review 🛂 in Data Catalogue Dec 9, 2024
@mitchdawson1982 mitchdawson1982 added DataHub Issues relating to DataHub - https://datahubproject.io/ python Pull requests that update Python code metadata quality labels Dec 9, 2024
@mitchdawson1982
Contributor

PR

@mitchdawson1982 mitchdawson1982 moved this from Review 🛂 to In Progress 🚀 in Data Catalogue Dec 13, 2024
@mitchdawson1982 mitchdawson1982 moved this from In Progress 🚀 to Review 🛂 in Data Catalogue Dec 13, 2024
Projects
Status: Review 🛂
2 participants