Custom error messages for IO with nonexistent files #14662

Merged

vuule
Contributor

@vuule vuule commented Dec 20, 2023

Description

Closes #12311; closes #9564
Previously, opening a file that does not exist returned a somewhat cryptic error: "Cannot query file size".

With this change, the error reports that the file does not exist or, if it does exist, the errno value set by the failed open.
Also added a check for the output file's directory in file_sink. The same check is included in file_wrapper, in case the initialization order changes at some point.
Missing output file directories and missing input files should now always be reported correctly.
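
The input-file behavior described above could look roughly like the following. This is a minimal sketch, not the code merged in this PR: `open_input_file` is a hypothetical name, it throws `std::runtime_error` rather than using `CUDF_FAIL`, and it assumes a POSIX `open` and C++17.

```cpp
#include <cerrno>
#include <fcntl.h>
#include <filesystem>
#include <stdexcept>
#include <string>
#include <system_error>

// Hypothetical helper (not a libcudf function): opens `filepath` read-only and
// returns the descriptor, or throws with a message that names the likely cause.
int open_input_file(std::string const& filepath)
{
  int const fd = ::open(filepath.c_str(), O_RDONLY);
  if (fd != -1) { return fd; }  // caller is responsible for ::close(fd)

  int const err = errno;  // save errno before other calls can overwrite it

  // The error_code overload does not throw; exists() clears `ec` when the
  // file status is known, which includes the "not found" case.
  std::error_code ec;
  if (!std::filesystem::exists(filepath, ec) && !ec) {
    throw std::runtime_error("Cannot open file; it does not exist: " + filepath);
  }
  throw std::runtime_error("Cannot open file; failed with errno: " + std::to_string(err));
}
```

Checking for existence only after the open has failed keeps the success path free of extra filesystem calls.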

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

@vuule vuule self-assigned this Dec 20, 2023
@github-actions github-actions bot added the libcudf (Affects libcudf (C++/CUDA) code) label Dec 20, 2023
@vuule vuule added the non-breaking (Non-breaking change) and improvement (Improvement / enhancement to an existing function) labels Dec 21, 2023
@vuule vuule changed the title from "Custom error message when trying to read a file that does not exist" to "Custom error messages for IO with nonexistent files" Dec 26, 2023
@vuule vuule marked this pull request as ready for review December 27, 2023 17:43
@vuule vuule requested a review from a team as a code owner December 27, 2023 17:43
CUDF_FAIL("Cannot open output file; directory does not exist");
}
CUDF_FAIL("Cannot open output file");
}
Contributor

Would it help to make this code similar to cpp/src/io/utilities/file_io_utilities.cpp? That seems more detailed.

Contributor Author

added errno. I originally excluded it because I didn't know if ofstream::open sets it.
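
For illustration, the output-side check might be sketched as below. The C++ standard does not specify whether `std::ofstream` sets errno on a failed open; implementations layered over the C library or POSIX calls usually leave a meaningful value, so the sketch resets errno first and treats the reported value as best effort. `open_output_file` is a hypothetical name, and `std::runtime_error` stands in for `CUDF_FAIL`.

```cpp
#include <cerrno>
#include <filesystem>
#include <fstream>
#include <stdexcept>
#include <string>
#include <system_error>

// Hypothetical helper (not the file_sink code from this PR).
std::ofstream open_output_file(std::string const& filepath)
{
  errno = 0;  // avoid reporting a stale value if the stream does not set errno
  std::ofstream out(filepath, std::ios::out | std::ios::binary | std::ios::trunc);
  if (out.is_open()) { return out; }

  int const err = errno;  // best effort: not guaranteed by the C++ standard

  // A missing parent directory is the most common cause, so report it explicitly.
  auto const dir = std::filesystem::path(filepath).parent_path();
  std::error_code ec;
  if (!dir.empty() && !std::filesystem::exists(dir, ec) && !ec) {
    throw std::runtime_error("Cannot open output file; directory does not exist: " +
                             dir.string());
  }
  throw std::runtime_error("Cannot open output file; errno: " + std::to_string(err));
}
```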

Contributor

Should we move this into file_io_utilities, since it should contain all file IO?

Contributor

@mythrocks mythrocks Jan 4, 2024

TIL std::strerror.

I had to look up where the allocation happens for the string returned by std::strerror(). I'm not convinced it is threadsafe, given that it's allowed to reuse the same allocated space to return strings across threads.

Should we consider switching this to strerror_s()?

Edit: Source.

Contributor Author

great catch! I'll look into strerror_s.

Contributor Author

> Should we move this into file_io_utilities?

I don't think file_sink belongs in utilities, but I'll try to reuse this error checking and place the common code in file_io_utilities.

Contributor Author

I should have just put a mutex in here. strerror_s portability is horrendous.
Our compiler does not support it, so I have to use strerror_r. Aaand, this one is problematic as well, see https://www.club.cc.cmu.edu/~cmccabe/blog_strerror.html
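
To make the two options in this thread concrete, here is a rough sketch of both. Neither is claimed to be what was merged: `locked_strerror` and `portable_strerror_r` are hypothetical names, and the second assumes a POSIX platform. The `strerror_r` difficulty is that the GNU variant returns `char*` while the XSI variant returns `int`; overloading on the return type is one common way to handle both without preprocessor checks.

```cpp
#include <cstring>
#include <mutex>
#include <string>
#include <string.h>  // strerror_r (POSIX; not part of standard C++)

// Option 1: serialize calls to std::strerror and copy the result while the lock
// is held. This only protects callers that go through this helper; code that
// calls strerror directly elsewhere can still race with it.
std::string locked_strerror(int errnum)
{
  static std::mutex mtx;
  std::lock_guard<std::mutex> lock(mtx);
  return std::strerror(errnum);
}

namespace detail {
// XSI variant: returns 0 on success and writes the message into `buf`.
inline std::string to_message(int rc, char const* buf)
{
  return rc == 0 ? std::string(buf) : std::string("Unknown error");
}
// GNU variant: returns a pointer to the message, which may or may not be `buf`.
inline std::string to_message(char const* msg, char const* /*buf*/)
{
  return msg != nullptr ? std::string(msg) : std::string("Unknown error");
}
}  // namespace detail

// Option 2: call strerror_r and let overload resolution pick the matching handler.
std::string portable_strerror_r(int errnum)
{
  char buf[256] = {};
  return detail::to_message(strerror_r(errnum, buf, sizeof(buf)), buf);
}
```

Both helpers return a `std::string` copy, so the caller never holds a pointer into a buffer that another thread might overwrite.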

Contributor

@mythrocks mythrocks left a comment

LGTM, save for the potential thread-safety issue.

@vuule vuule requested a review from ttnghia January 4, 2024 22:48
Contributor

@karthikeyann karthikeyann left a comment

LGTM

TIL [[noreturn]]
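
For readers who have not met the attribute either, a small sketch of why it is useful for error-reporting helpers (the function names here are hypothetical, not taken from the PR):

```cpp
#include <stdexcept>
#include <string>

// Hypothetical helper: it always throws, so it is marked [[noreturn]].
[[noreturn]] void throw_open_failure(std::string const& filepath)
{
  throw std::runtime_error("Cannot open file: " + filepath);
}

// Because the compiler knows throw_open_failure() never returns, the failure
// path needs no dummy return statement, and compilers typically do not warn
// about control reaching the end of a non-void function.
int validated_fd(int fd, std::string const& filepath)
{
  if (fd >= 0) { return fd; }
  throw_open_failure(filepath);
}
```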

@vuule
Contributor Author

vuule commented Jan 5, 2024

/merge

@rapids-bot rapids-bot bot merged commit 0c98134 into rapidsai:branch-24.02 Jan 5, 2024
70 checks passed
@vuule vuule deleted the impr-failed-file-open-error-msg branch January 5, 2024 22:22
Labels
improvement (Improvement / enhancement to an existing function), libcudf (Affects libcudf (C++/CUDA) code), non-breaking (Non-breaking change)
4 participants