Currently there is no handling for possible discrepancies between the two backends we use to store metadata. As a result, CKAN displays a dataset (because it exists in the CKAN metastore) but fails when trying to edit it because it doesn't exist in the metastore-lib backend (for example, when data has not been migrated into GitHub repositories).
Example traceback:
File '/usr/lib/ckan/src/ckan/ckan/logic/action/update.py', line 334 in package_update
item.after_update(context, data)
File '/usr/local/lib/python2.7/dist-packages/ckanext/versioning/plugin.py', line 149 in after_update
pkg_dict['name'], datapackage, author=author)
File '/usr/local/lib/python2.7/dist-packages/metastore/backend/github/storage.py', line 109 in update
repo = self._get_repo(package_id)
File '/usr/local/lib/python2.7/dist-packages/metastore/backend/github/storage.py', line 226 in _get_repo
raise exc.NotFound('Could not find package {}'.format(package_id))
NotFound: Could not find package testing-versions
This also introduces a hard dependency: we cannot change or update the metastore-lib backend without a data migration. That is something that shouldn't happen, but it is worth keeping in mind while we are in the development workflow.
Is this going to be handled in a specific way?
Some scenarios I can think of:
During data migration some repositories are not created, so there are datasets in CKAN that don't exist in the new backend.
Someone edits the git backend directly, and now there are resources and metadata that no longer exist in the CKAN metastore.
Some process updates the CKAN database directly without updating metastore-lib (e.g. a data migration script run directly against the database).
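To make the first two scenarios concrete, here is a rough sketch of a consistency check that compares dataset names on both sides. The function name `find_discrepancies` and the sample names other than `testing-versions` are made up for illustration; how the two name lists would actually be fetched from CKAN and from the metastore-lib backend is left out, since I don't want to guess at those APIs. A name-only comparison would not catch the third scenario, where both sides have the dataset but the content has drifted.

```python
def find_discrepancies(ckan_names, backend_names):
    """Compare dataset names known to CKAN with those in the backend.

    Both arguments are plain iterables of dataset names; fetching them
    from the real systems is outside the scope of this sketch.
    """
    ckan_names = set(ckan_names)
    backend_names = set(backend_names)
    return {
        # Shown by CKAN but editing would fail with NotFound.
        "missing_in_backend": sorted(ckan_names - backend_names),
        # Present in the backend but unknown to CKAN.
        "missing_in_ckan": sorted(backend_names - ckan_names),
    }


# Hypothetical sample data, not taken from a real deployment.
report = find_discrepancies(
    ["testing-versions", "air-quality"],
    ["air-quality", "orphaned-repo"],
)
print(report)
```

Something like this could run as a periodic job or right after a migration, so discrepancies are reported before a user hits them on edit.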
👍 I like the idea of fault tolerance and graceful degradation. I think migrations are much easier if systems allow for eventual consistency rather than expecting to be consistent 100% of the time.
My initial thought was to ensure that if a dataset doesn't exist in the metastore we rely on CKAN data and "migrate" on demand, either when the dataset is saved for the first time or when it is read for the first time (though I think the latter is slightly less preferred).
I am not sure how to prioritize this, but I will look into the complexity of this.
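A minimal sketch of the migrate-on-save idea, just to gauge the shape of it. Everything here is a stand-in: `InMemoryBackend`, its `create` method, the local `NotFound` class and `update_or_migrate` are hypothetical and only mimic the `update(package_id, datapackage, author=...)` call and the `NotFound` error visible in the traceback. The real metastore-lib API may look different.

```python
class NotFound(Exception):
    """Stand-in for the NotFound error raised in the traceback."""


class InMemoryBackend(object):
    """Hypothetical stand-in for a metastore-lib storage backend."""

    def __init__(self):
        self._packages = {}

    def create(self, package_id, datapackage, author=None):
        # Assumed operation: store a package that does not exist yet.
        self._packages[package_id] = dict(datapackage)

    def update(self, package_id, datapackage, author=None):
        # Mirrors the failure mode from the traceback.
        if package_id not in self._packages:
            raise NotFound('Could not find package {}'.format(package_id))
        self._packages[package_id] = dict(datapackage)


def update_or_migrate(backend, package_id, datapackage, author=None):
    """Try a normal update; if the package is missing, create it from CKAN data."""
    try:
        backend.update(package_id, datapackage, author=author)
        return "updated"
    except NotFound:
        # The dataset exists in CKAN but not in the backend: migrate on demand.
        backend.create(package_id, datapackage, author=author)
        return "migrated"


backend = InMemoryBackend()
first = update_or_migrate(backend, "testing-versions", {"name": "testing-versions"})
second = update_or_migrate(
    backend, "testing-versions", {"name": "testing-versions", "title": "Testing"}
)
print(first, second)
```

The first save falls back to creating the package and later saves go through the normal update path, so the edit no longer fails for datasets that were never migrated.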