Update Error message for VersionNotFoundError to handle Permission related issues better #1881

Merged
@ankatiyar merged 10 commits into main from feat/catch-error-objectstores on Oct 3, 2022

Conversation

Contributor

@ankatiyar ankatiyar commented Sep 27, 2022

Signed-off-by: Ankita Katiyar [email protected]

Description

Resolves #1768

Development notes

Fresh PR to resolve the issue.
Closing #1809 in favour of this solution: in some cases PermissionError was being handled outside of Kedro. This PR is a workaround, as the existing error message is insufficient and it is hard to pinpoint the source of the error precisely when cloud storage is involved.

Proposed Solution

As a result, this PR modifies the error message for VersionNotFoundError to indicate that the error could be caused by insufficient permissions when the data lives in cloud storage, e.g. S3 or GCS.
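The idea can be sketched roughly as follows. This is a minimal, standalone illustration and not the code merged into kedro/io/core.py: the contents of `CLOUD_PROTOCOLS`, the helper names `build_no_versions_message` and `raise_if_no_versions`, and the local `VersionNotFoundError` class are assumptions made for the example. Only the message text follows the wording discussed in this PR.

```python
from urllib.parse import urlsplit

# Assumption: an illustrative set of schemes. The real CLOUD_PROTOCOLS
# constant in kedro/io/core.py may contain different values.
CLOUD_PROTOCOLS = ("s3", "gcs", "gs")


class VersionNotFoundError(Exception):
    """Local stand-in for Kedro's exception of the same name."""


def build_no_versions_message(dataset_repr: str, filepath: str) -> str:
    """Build the error text, adding a permission hint for cloud paths."""
    # Paths without a scheme are treated as local files.
    protocol = urlsplit(filepath).scheme or "file"
    message = f"Did not find any versions for {dataset_repr}"
    if protocol in CLOUD_PROTOCOLS:
        # An empty listing on an object store may hide a permission problem.
        message += ". This could be due to insufficient permission."
    return message


def raise_if_no_versions(matches, dataset_repr: str, filepath: str) -> str:
    """Return the most recent match, or raise if the listing is empty."""
    if not matches:
        raise VersionNotFoundError(
            build_no_versions_message(dataset_repr, filepath)
        )
    return sorted(matches, reverse=True)[0]
```

For a local path the message is unchanged; only paths whose scheme is in the assumed cloud set get the extra hint.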

Checklist

  • Read the contributing guidelines
  • Opened this PR as a 'Draft Pull Request' if it is work-in-progress
  • Updated the documentation to reflect the code changes
  • Added a description of this change in the RELEASE.md file
  • Added tests to cover my changes

@noklam noklam changed the title from "Update message for VersionNotFoundError" to "Update Error message for VersionNotFoundError to handle Permission related issues better" Sep 28, 2022
Contributor

noklam commented Sep 28, 2022

@ankatiyar Don't forget to document how this decision was made!

@ankatiyar ankatiyar marked this pull request as ready for review September 29, 2022 13:39
@ankatiyar ankatiyar requested a review from idanov as a code owner September 29, 2022 13:39
@@ -652,3 +652,14 @@ def test_replacing_nonword_characters(self):
        assert "ds2_spark" in catalog.datasets.__dict__
        assert "ds3__csv" in catalog.datasets.__dict__
        assert "jalapeño" in catalog.datasets.__dict__

    def test_no_version_cloud(self):
Contributor

@noklam noklam Sep 29, 2022

Suggested change
- def test_no_version_cloud(self):
+ def test_no_versions_with_cloud_protocol(self):

Just keeping it consistent, as the rest of the codebase uses test_no_versions.

Contributor Author

Made the change in 211c449

    def test_no_version_cloud(self):
        """Check the error if no versions are available for load from cloud storage"""
        version = Version(load=None, save=None)
        ds = CSVDataSet("s3://bucket/file.csv", version=version)
Contributor

Suggested change
- ds = CSVDataSet("s3://bucket/file.csv", version=version)
+ versioned_dataset = CSVDataSet("s3://bucket/file.csv", version=version)

Contributor Author

Made the change in 211c449

Comment on lines 657 to 665
        """Check the error if no versions are available for load from cloud storage"""
        version = Version(load=None, save=None)
        ds = CSVDataSet("s3://bucket/file.csv", version=version)
        pattern = re.escape(
            f"Did not find any versions for {ds} "
            f"This could be due to insufficient permission."
        )
        with pytest.raises(DataSetError, match=pattern):
            ds.load()
Contributor

    def test_no_versions(self, versioned_plot_writer):
        """Check the error if no versions are available for load."""
        pattern = r"Did not find any versions for MatplotlibWriter\(.+\)"
        with pytest.raises(DataSetError, match=pattern):
            versioned_plot_writer.load()

I found this test in one of our dataset tests; can we follow the same pattern here? I think an "rf" string should behave the same as escape.

Contributor Author

I tried using "rf" but it doesn't seem to work.
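A likely explanation for why the "rf" attempt fails: a raw f-string only stops Python from processing backslashes in the literal part; it does not escape regex metacharacters in the interpolated value. A dataset's string form typically contains parentheses, which the regex engine reads as a group rather than as literal characters. The existing MatplotlibWriter test avoids this by hand-escaping them as `\(.+\)`. The sketch below uses a made-up `dataset_repr` (an assumption, not the actual CSVDataSet output) to show the difference.

```python
import re

# Hypothetical repr, shaped like what a versioned dataset might print.
dataset_repr = "CSVDataSet(filepath=bucket/file.csv, load_args={}, protocol=s3)"
message = (
    f"Did not find any versions for {dataset_repr}. "
    "This could be due to insufficient permission."
)

# Raw f-string: "(" and ")" in the interpolated value become a regex
# group, so the literal parentheses in the message are never matched.
raw_pattern = rf"Did not find any versions for {dataset_repr}"

# re.escape: every metacharacter is escaped, so the text matches literally.
escaped_pattern = re.escape(f"Did not find any versions for {dataset_repr}")

raw_match = re.search(raw_pattern, message)
escaped_match = re.search(escaped_pattern, message)

print("raw f-string matched:", raw_match is not None)
print("re.escape matched:", escaped_match is not None)
```

Since `pytest.raises(..., match=...)` checks the pattern with `re.search`, the same distinction applies to the test under review.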

Contributor

@noklam noklam left a comment

Thank you for sticking with it. I think this is a tough one for one of your first few issues. It's in good shape already; most of my comments are about styling.

A good tip: always check the existing codebase to see whether something similar already exists.

@ankatiyar ankatiyar requested a review from yetudada as a code owner September 29, 2022 15:26
Member

@merelcht merelcht left a comment

Just to double check that I understand this correctly: the error returned by fsspec isn't a PermissionError and that's why we cannot know for sure that there is something wrong with permission?

kedro/io/core.py Outdated
raise VersionNotFoundError(f"Did not find any versions for {self}")
if protocol in CLOUD_PROTOCOLS:
message = (
f"Did not find any versions for {self} This could be "
Member

Suggested change
- f"Did not find any versions for {self} This could be "
+ f"Did not find any versions for {self}. This could be "

Contributor

@MerelTheisenQB Correct, it's possible to get an error from some other functions, but all datasets rely on self._fs.glob, and it won't raise any error when there is a permission issue.

It seems that this is the expected behavior of glob. For example, if you actually ls the filesystem, you will get a permission error. However, that is a very expensive operation, especially with object stores, so I didn't want to use it.
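The ambiguity described here can be illustrated with a toy model. `ToyObjectStore` below is entirely hypothetical and is not the fsspec API; it only mimics the behaviour reported in this comment, where `glob` returns an empty result for unreadable objects while `ls` raises `PermissionError`.

```python
from fnmatch import fnmatchcase


class ToyObjectStore:
    """Hypothetical stand-in for a cloud filesystem (not fsspec)."""

    def __init__(self, objects, readable_prefixes):
        self._objects = list(objects)
        self._readable = tuple(readable_prefixes)

    def _allowed(self, path):
        # True when the path falls under a prefix the caller may read.
        return path.startswith(self._readable)

    def ls(self, prefix):
        # Listing a forbidden prefix fails loudly.
        if not self._allowed(prefix):
            raise PermissionError(f"Access denied: {prefix}")
        return [o for o in self._objects if o.startswith(prefix)]

    def glob(self, pattern):
        # Assumed behaviour, modelled on the comment above: objects the
        # caller cannot read are silently left out of the result.
        return [
            o for o in self._objects
            if self._allowed(o) and fnmatchcase(o, pattern)
        ]


store = ToyObjectStore(
    objects=["private/file.csv/2022-09-27T10.00.00.000Z/file.csv"],
    readable_prefixes=["public/"],
)

forbidden = store.glob("private/file.csv/*/file.csv")  # exists, unreadable
missing = store.glob("public/file.csv/*/file.csv")     # does not exist
```

In this model both calls return an empty list, so the caller cannot tell "no versions" from "no permission". That is why the PR adds a hint to the message instead of raising a dedicated permission error.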

Contributor Author

Made the change in 211c449

Member

Thanks for the explanation @noklam 🙂

Member

@merelcht merelcht left a comment

This was definitely a lot trickier than we thought, but this solution makes sense to me 👍

Don't forget to add the change in the release notes.

Contributor

@AhdraMeraliQB AhdraMeraliQB left a comment

Nice work on this!! Thanks for sticking with it 😄

@ankatiyar ankatiyar merged commit 929249b into main Oct 3, 2022
@ankatiyar ankatiyar deleted the feat/catch-error-objectstores branch October 3, 2022 10:30
AhdraMeraliQB added a commit that referenced this pull request Oct 21, 2022
…related issues better (#1881)

* Update message for VersionNotFoundError

Signed-off-by: Ankita Katiyar <[email protected]>

* Add test for VersionNotFoundError for cloud protocols

* Update test_data_catalog.py

Update NoVersionFoundError test

* minor linting update

* update docs link + styling changes

* Revert "update docs link + styling changes"

This reverts commit 6088e00.

* Update test with styling changes

* Update RELEASE.md

Signed-off-by: ankatiyar <[email protected]>

Signed-off-by: Ankita Katiyar <[email protected]>
Signed-off-by: ankatiyar <[email protected]>
Co-authored-by: Ahdra Merali <[email protected]>
Signed-off-by: Ahdra Merali <[email protected]>
nickolasrm pushed a commit to ProjetaAi/kedro that referenced this pull request Oct 26, 2022
…related issues better (kedro-org#1881)

Signed-off-by: nickolasrm <[email protected]>
AhdraMeraliQB added a commit that referenced this pull request Nov 9, 2022
* Release/0.18.3 (#1856)

* Update release version and release notes

Signed-off-by: Nok Chan <[email protected]>

* Update missing release notes

Signed-off-by: Nok Chan <[email protected]>

* update vresion

Signed-off-by: Nok Chan <[email protected]>

* update release notes

Signed-off-by: Nok Chan <[email protected]>

Signed-off-by: Nok Chan <[email protected]>
Signed-off-by: Ahdra Merali <[email protected]>

* Remove comment from code example

Signed-off-by: Ahdra Merali <[email protected]>

* Remove more comments

Signed-off-by: Ahdra Merali <[email protected]>

* Add YAML formatting

Signed-off-by: Ahdra Merali <[email protected]>

* Add missing import

Signed-off-by: Ahdra Merali <[email protected]>

* Remove even more comments

Signed-off-by: Ahdra Merali <[email protected]>

* Remove more even more comments

Signed-off-by: Ahdra Merali <[email protected]>

* Add pickle requirement to extras_require

Signed-off-by: Ahdra Merali <[email protected]>

* Try fix YAML docs

Signed-off-by: Ahdra Merali <[email protected]>

* Try fix YAML docs pt 2

Signed-off-by: Ahdra Merali <[email protected]>

* Fix code snippets in docs (#1876)

* Fix code snippets

Signed-off-by: Ahdra Merali <[email protected]>

* Separate code blocks

Signed-off-by: Ahdra Merali <[email protected]>

* Lint

Signed-off-by: Ahdra Merali <[email protected]>

Signed-off-by: Ahdra Merali <[email protected]>

* Fix issue with specifying format for SparkHiveDataSet (#1857)

Signed-off-by: jstammers <[email protected]>
Signed-off-by: Ahdra Merali <[email protected]>

* Update RELEASE.md (#1883)

* Update RELEASE.md

* fix broken link

* Update RELEASE.md

Co-authored-by: Merel Theisen <[email protected]>

Co-authored-by: Merel Theisen <[email protected]>
Signed-off-by: Ahdra Merali <[email protected]>

* Deprecate `kedro test` and `kedro lint` (#1873)

* Deprecating `kedro test` and `kedro lint`

Signed-off-by: Nok Chan <[email protected]>

* Deprecate commands

Signed-off-by: Nok Chan <[email protected]>

* Make kedro looks prettier

* Update Linting

Signed-off-by: Nok <[email protected]>

Signed-off-by: Nok Chan <[email protected]>
Signed-off-by: Nok <[email protected]>
Signed-off-by: Ahdra Merali <[email protected]>

* Fix micro package pull from PyPI (#1848)

Signed-off-by: Florian Gaudin-Delrieu <[email protected]>
Signed-off-by: Ahdra Merali <[email protected]>

* Update Error message for `VersionNotFoundError` to handle Permission related issues better (#1881)

* Update message for VersionNotFoundError

Signed-off-by: Ankita Katiyar <[email protected]>

* Add test for VersionNotFoundError for cloud protocols

* Update test_data_catalog.py

Update NoVersionFoundError test

* minor linting update

* update docs link + styling changes

* Revert "update docs link + styling changes"

This reverts commit 6088e00.

* Update test with styling changes

* Update RELEASE.md

Signed-off-by: ankatiyar <[email protected]>

Signed-off-by: Ankita Katiyar <[email protected]>
Signed-off-by: ankatiyar <[email protected]>
Co-authored-by: Ahdra Merali <[email protected]>
Signed-off-by: Ahdra Merali <[email protected]>

* Update experiment tracking documentation with working examples (#1893)

Signed-off-by: Merel Theisen <[email protected]>
Signed-off-by: Ahdra Merali <[email protected]>

* Add NHS AI Lab and ReSpo.Vision to companies list (#1878)

Signed-off-by: Ahdra Merali <[email protected]>

* Document how users can use pytest instead of kedro test (#1879)

* Add best_practices.md with introductory sections

Signed-off-by: Jannic Holzer <[email protected]>

* Add pytest and pytest-cov sections

Signed-off-by: Jannic Holzer <[email protected]>

* Add pytest-cov coverage report

Signed-off-by: Jannic Holzer <[email protected]>

* Add sections on pytest-cov

Signed-off-by: Jannic Holzer <[email protected]>

* Add automated_testing to index.rst

Signed-off-by: Jannic Holzer <[email protected]>

* Reformat third-party library names and clean grammar.

Signed-off-by: Jannic Holzer <[email protected]>

* Add link to virtual environment docs

Signed-off-by: Jannic Holzer <[email protected]>

* Add example of good test naming

Signed-off-by: Jannic Holzer <[email protected]>

* Improve link accessibility

Signed-off-by: Jannic Holzer <[email protected]>

* Improve pytest docs link accessibility

Signed-off-by: Jannic Holzer <[email protected]>

* Add reminder link to virtual environment docs

Signed-off-by: Jannic Holzer <[email protected]>

* Fix formatting in link to coverage docs

Signed-off-by: Jannic Holzer <[email protected]>

* Remove reference to /src under 'Run your tests'

Signed-off-by: Jannic Holzer <[email protected]>

* Modify references to <project_name> to <package_name>

Signed-off-by: Jannic Holzer <[email protected]>

* Fix sentence structure

Signed-off-by: Jannic Holzer <[email protected]>

* Fix broken databricks doc link

Signed-off-by: Jannic Holzer <[email protected]>

Signed-off-by: Jannic Holzer <[email protected]>
Signed-off-by: Ahdra Merali <[email protected]>

* Capitalise Kedro-Viz in the "Visualize layers" section (#1899)

* Capitalised kedro-viz

Signed-off-by: yash6318 <[email protected]>

* capitalised Kedro viz

Signed-off-by: yash6318 <[email protected]>

* Updated set_up_experiment_tracking.md

Co-authored-by: Deepyaman Datta <[email protected]>
Signed-off-by: yash6318 <[email protected]>

Signed-off-by: yash6318 <[email protected]>
Co-authored-by: Deepyaman Datta <[email protected]>
Signed-off-by: Ahdra Merali <[email protected]>

* Fix linting on autmated test page (#1906)

Signed-off-by: Merel Theisen <[email protected]>
Signed-off-by: Ahdra Merali <[email protected]>

* Add _SINGLE_PROCESS property to CachedDataSet (#1905)

Signed-off-by: Carla Vieira <[email protected]>
Signed-off-by: Ahdra Merali <[email protected]>

* Update the tutorial of "Visualise pipelines" (#1913)

* Change a file extention to match the previous article

Signed-off-by: dinotuku <[email protected]>

* Add a missing import

Signed-off-by: dinotuku <[email protected]>

* Change both preprocessed datasets to parquet files

Signed-off-by: dinotuku <[email protected]>

* Change data type to ParquetDataSet for parquet files

Signed-off-by: dinotuku <[email protected]>

* Add a note for installing seaborn if it is not installed

Signed-off-by: dinotuku <[email protected]>

Signed-off-by: dinotuku <[email protected]>
Signed-off-by: Ahdra Merali <[email protected]>

* Document how users can use linting tools instead of `kedro lint` (#1904)

* Add documentation for linting tools

Signed-off-by: Ankita Katiyar <[email protected]>

* Revert changes to commands_reference.md

Signed-off-by: Ankita Katiyar <[email protected]>

* Update linting docs with suggestions

Signed-off-by: Ankita Katiyar <[email protected]>

* Update linting doc

Signed-off-by: Ankita Katiyar <[email protected]>

Signed-off-by: Ankita Katiyar <[email protected]>
Signed-off-by: Ahdra Merali <[email protected]>

* Make core config accessible in dict get way  (#1870)

Signed-off-by: Merel Theisen <[email protected]>
Signed-off-by: Ahdra Merali <[email protected]>

* Create dependabot.yml configuration file for version updates (#1862)

* Create dependabot.yml configuration file

* Update dependabot.yml

Signed-off-by: SajidAlamQB <[email protected]>

* add target-branch

Signed-off-by: SajidAlamQB <[email protected]>

* Update dependabot.yml

Signed-off-by: SajidAlamQB <[email protected]>

* limit dependabot to just dependency folder

Signed-off-by: SajidAlamQB <[email protected]>

* Update test_requirements.txt

Signed-off-by: SajidAlamQB <[email protected]>

* Update MANIFEST.in

Signed-off-by: SajidAlamQB <[email protected]>

* fix e2e

Signed-off-by: SajidAlamQB <[email protected]>

* Update continue_config.yml

Signed-off-by: SajidAlamQB <[email protected]>

* Update requirements.txt

Signed-off-by: SajidAlamQB <[email protected]>

* Update requirements.txt

Signed-off-by: SajidAlamQB <[email protected]>

* fix link

Signed-off-by: SajidAlamQB <[email protected]>

* revert

Signed-off-by: SajidAlamQB <[email protected]>

* Delete requirements.txt

Signed-off-by: SajidAlamQB <[email protected]>

Signed-off-by: SajidAlamQB <[email protected]>
Signed-off-by: Ahdra Merali <[email protected]>

* Update dependabot config (#1928)

Signed-off-by: Ahdra Merali <[email protected]>

* Update robots.txt (#1929)

Signed-off-by: Ahdra Merali <[email protected]>

* fix broken link (#1950)

Signed-off-by: Ahdra Merali <[email protected]>

* Update dependabot.yml config  (#1938)

* Update dependabot.yml

Signed-off-by: SajidAlamQB <[email protected]>

* pin jupyterlab_services to requirments

Signed-off-by: SajidAlamQB <[email protected]>

* lint

Signed-off-by: SajidAlamQB <[email protected]>

Signed-off-by: SajidAlamQB <[email protected]>
Signed-off-by: Ahdra Merali <[email protected]>

* Update setup.py Jinja2 dependencies (#1954)

Signed-off-by: Ahdra Merali <[email protected]>

* Update pip-tools requirement from ~=6.5 to ~=6.9 in /dependency (#1957)

Updates the requirements on [pip-tools](https://github.com/jazzband/pip-tools) to permit the latest version.
- [Release notes](https://github.com/jazzband/pip-tools/releases)
- [Changelog](https://github.com/jazzband/pip-tools/blob/master/CHANGELOG.md)
- [Commits](jazzband/pip-tools@6.5.0...6.9.0)

---
updated-dependencies:
- dependency-name: pip-tools
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <[email protected]>

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Signed-off-by: Ahdra Merali <[email protected]>

* Update toposort requirement from ~=1.5 to ~=1.7 in /dependency (#1956)

Updates the requirements on [toposort]() to permit the latest version.

---
updated-dependencies:
- dependency-name: toposort
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <[email protected]>

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Sajid Alam <[email protected]>
Signed-off-by: Ahdra Merali <[email protected]>

* Add deprecation warning to package_name argument in session create() (#1953)

Signed-off-by: Merel Theisen <[email protected]>
Signed-off-by: Ahdra Merali <[email protected]>

* Remove redundant `resolve_load_version` call (#1911)

* remove a redundant function call

Signed-off-by: Nok Chan <[email protected]>

* Remove redundant resolove_load_version & fix test

Signed-off-by: Nok Chan <[email protected]>

* Fix HoloviewWriter tests with more specific error message pattern & Lint

Signed-off-by: Nok Chan <[email protected]>

* Rename tests

Signed-off-by: Nok Chan <[email protected]>

Signed-off-by: Nok Chan <[email protected]>
Signed-off-by: Ahdra Merali <[email protected]>

* Make docstring in test starter match real starters (#1916)

Signed-off-by: Ahdra Merali <[email protected]>

* Try to fix formatting error

Signed-off-by: Merel Theisen <[email protected]>

* Specify pickle import

Signed-off-by: Nok Chan <[email protected]>
Signed-off-by: Ahdra Merali <[email protected]>
Signed-off-by: jstammers <[email protected]>
Signed-off-by: Nok <[email protected]>
Signed-off-by: Florian Gaudin-Delrieu <[email protected]>
Signed-off-by: Ankita Katiyar <[email protected]>
Signed-off-by: ankatiyar <[email protected]>
Signed-off-by: Merel Theisen <[email protected]>
Signed-off-by: Jannic Holzer <[email protected]>
Signed-off-by: yash6318 <[email protected]>
Signed-off-by: Carla Vieira <[email protected]>
Signed-off-by: dinotuku <[email protected]>
Signed-off-by: Ankita Katiyar <[email protected]>
Signed-off-by: SajidAlamQB <[email protected]>
Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: Nok <[email protected]>
Co-authored-by: Jimmy Stammers <[email protected]>
Co-authored-by: Merel Theisen <[email protected]>
Co-authored-by: Florian Gaudin-Delrieu <[email protected]>
Co-authored-by: Ankita Katiyar <[email protected]>
Co-authored-by: Yetunde Dada <[email protected]>
Co-authored-by: Jannic <[email protected]>
Co-authored-by: Yash Agrawal <[email protected]>
Co-authored-by: Deepyaman Datta <[email protected]>
Co-authored-by: Carla Vieira <[email protected]>
Co-authored-by: Kuan Tung <[email protected]>
Co-authored-by: Sajid Alam <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Merel Theisen <[email protected]>
Co-authored-by: Merel Theisen <[email protected]>
Development

Successfully merging this pull request may close these issues.

Catch error for issues with reading/writing data to object stores (e.g. S3)
4 participants