[GCP Storage] Upgrade GCSFuse version #1829

Merged
merged 10 commits into from
Apr 11, 2023


Michaelvll commented Mar 31, 2023

We hit an issue where large model checkpoints saved to a GCS bucket mounted with gcsfuse can be incomplete. This may be caused by the old gcsfuse version we were using. The gcsfuse release notes (found by @infwinston) mention:

Fixed a critical bug in write workflow - although it was very rare, but gcsfuse was writing wrong file content while doing append operation on larger file.

https://github.com/GoogleCloudPlatform/gcsfuse/releases/tag/v0.41.11

Tested (run the relevant ones):

  • Any manual or new tests for this PR (please specify below)
    • Launch a node and write things to the mounted bucket.
    • pytest tests/test_smoke.py::test_aws_storage_mounts --aws; to test the mounting of GCS bucket on AWS VM.
    • pytest tests/test_smoke.py::test_gcp_storage_mounts
  • All smoke tests: pytest tests/test_smoke.py
  • Relevant individual smoke tests: pytest tests/test_smoke.py::test_fill_in_the_name
  • Backward compatibility tests: bash tests/backward_comaptibility_tests.sh


romilbhardwaj commented Mar 31, 2023

Thanks! Should we bump to 0.42.3 (latest) if we're upgrading versions?


Michaelvll commented Apr 1, 2023

> Thanks! Should we bump to 0.42.3 (latest) if we're upgrading versions?

Isn't it already 0.42.3 in the PR changes?

Weirdly, the launched VM is still using 0.41.9. I'm still trying to figure out why.
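
One way to confirm which version a launched VM actually has is to pull the version number out of the `gcsfuse --version` output and compare it with the expected one. A minimal sketch, assuming the output contains a dotted version such as `0.41.9`. The helper names and the sample output string are hypothetical, and the exact output format of `gcsfuse --version` is an assumption:

```python
import re
from typing import Optional


def extract_version(version_output: str) -> Optional[str]:
    """Return the first dotted x.y.z version found in the text, or None."""
    match = re.search(r'\d+\.\d+\.\d+', version_output)
    return match.group(0) if match else None


def is_expected_version(version_output: str, expected: str) -> bool:
    """True only if the first version found in the output equals `expected`."""
    return extract_version(version_output) == expected


# Made-up sample output, for illustration only.
sample = 'gcsfuse version 0.41.9 (Go version go1.18.4)'
print(extract_version(sample))               # 0.41.9
print(is_expected_version(sample, '0.42.3')) # False
```

Comparing the extracted string exactly avoids the partial matches that a plain substring search could produce.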

Comment on lines 29 to 31
installed_check = f'[ -x "$(command -v {mount_binary})" ]'
if version is not None:
    installed_check += f' && {mount_binary} --version | grep -q {version}'

Since not all mounting tools may have a --version flag, can we rename the arg version to version_check_command, so that the caller can pass whatever version check command they'd like to run?

E.g., in this case, the caller would pass version_check_command = 'gcsfuse --version | grep -q 1.0'
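
A minimal sketch of the suggested interface, assuming a hypothetical helper named `build_install_check` (the name and signature are illustrative and not taken from the PR diff). The caller supplies any shell command as `version_check_command`, and it is chained onto the existence check with `&&`:

```python
from typing import Optional


def build_install_check(mount_binary: str,
                        version_check_command: Optional[str] = None) -> str:
    """Build a shell test that succeeds only if the mounting tool is usable.

    Hypothetical helper for illustration; not the function from the PR.
    """
    # Succeeds when the binary is on PATH and executable.
    installed_check = f'[ -x "$(command -v {mount_binary})" ]'
    if version_check_command is not None:
        # The caller decides how to verify the version, since not every
        # mounting tool supports a --version flag.
        installed_check += f' && {version_check_command}'
    return installed_check


print(build_install_check('gcsfuse', 'gcsfuse --version | grep -q 0.42.3'))
# [ -x "$(command -v gcsfuse)" ] && gcsfuse --version | grep -q 0.42.3
```

Because a shell pipeline binds tighter than `&&`, the version check only runs when the binary exists, and the whole expression fails if either part fails.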


Good point! Fixed. Thanks!


romilbhardwaj left a comment


Thanks for adding the version check! Should be good to go if pytest tests/test_smoke.py::test_gcp_storage_mounts passes.

Michaelvll merged commit 9f3c05f into master Apr 11, 2023
Michaelvll deleted the update-gcsfuse branch April 11, 2023 20:59