Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve util/file for Windows #6221

Merged
merged 21 commits into from
Feb 15, 2023
Merged

Improve util/file for Windows #6221

merged 21 commits into from
Feb 15, 2023

Conversation

fealebenpae
Copy link
Member

@fealebenpae fealebenpae commented Jan 21, 2023

What, How & Why?

While looking at the Emscripten filesystem implementation I saw that there are a bunch of Windows-specific code paths in our file handling that can be improved. Since on Windows we always have std::filesystem available it's better to replace the Win32-specific implementations with ones based on std::filesystem. That way instead of a mix of Win32 and C APIs that require manual conversion between char and wchar strings for paths we can use the simpler std::filesystem implementation.

This also implements File::get_unique_id for Windows and refactors existing APIs and tests to use that instead of doing it manually.

☑️ ToDos

  • 📝 Changelog update
  • 🚦 Tests (or not relevant)
  • C-API, if public C++ API changed.

Comment on lines -1424 to -1467
#ifdef _WIN32
// can't rename to existing file on Windows
util::File::copy(tmp_path, m_db_path);
util::File::remove(tmp_path);
#else
util::File::move(tmp_path, m_db_path);
#endif
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

File::move() is now implemented in terms of std::filesystem::rename(), which does support overwriting renames on Windows.

@fealebenpae fealebenpae marked this pull request as ready for review January 21, 2023 18:53
Copy link
Contributor

@jedelbo jedelbo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Comment on lines 31 to 35
#ifndef _WIN32
#include <dirent.h> // POSIX.1-2001
#else
#include <Windows.h>
#endif
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: would you mind switching to an #ifdef and flipping the cases? I have a pet peeve for negated ifs with an else case, since it is basically saying "if non not windows".

@@ -1359,7 +1357,8 @@ inline File::Exists::Exists(const std::string& msg, const std::string& path)
inline bool operator==(const File::UniqueID& lhs, const File::UniqueID& rhs)
{
#ifdef _WIN32 // Windows version
throw util::runtime_error("Not yet supported");
return lhs.id_info.VolumeSerialNumber == rhs.id_info.VolumeSerialNumber &&
memcmp(&lhs.id_info.FileId, &rhs.id_info.FileId, sizeof(FILE_ID_128)) == 0;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
memcmp(&lhs.id_info.FileId, &rhs.id_info.FileId, sizeof(FILE_ID_128)) == 0;
memcmp(&lhs.id_info.FileId, &rhs.id_info.FileId, sizeof(lhs.id_info.FileId)) == 0;

It is usually preferable to use one of the expressions in the memfoo sizeof rather than a fixed type, that way you are guaranteed to use the right type. That is unless the expressions have some weird incorrect type and you actually do need to explicitly specify a different type.

@@ -1373,7 +1372,10 @@ inline bool operator!=(const File::UniqueID& lhs, const File::UniqueID& rhs)
inline bool operator<(const File::UniqueID& lhs, const File::UniqueID& rhs)
{
#ifdef _WIN32 // Windows version
throw util::runtime_error("Not yet supported");
return lhs.id_info.VolumeSerialNumber < rhs.id_info.VolumeSerialNumber &&
std::lexicographical_compare(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please avoid std::lexicographical_compare and use memcmp() < 0 instead. You could also make std::string_views and use < on them if you would prefer. One of the problems is that std::lexicographical_compare uses a different, incorrect, and slower ordering of char.

Comment on lines 2190 to 2191
REQUIRE(util::File::get_unique_id(config.path, id));
return id;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could this just be return util::File::get_unique_id(config.path)?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't have an overload of UniqueID File::get_unique_id(std::string).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup, I misread that file. It would be nice to have though 😄

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But totally fine if you don't want to do it right now.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about transforming bool get_unique_id(const std::string&, UniqueID&) into std::optional<UniqueID> get_unique_id(const std::string& path)? I think that would make using it less clumsy.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SGTM

Comment on lines 242 to 245
if (error == errc::permission_denied || error == errc::read_only_file_system ||
error == errc::device_or_resource_busy || error == errc::operation_not_permitted ||
error == errc::file_exists || error == errc::directory_not_empty) {
throw File::PermissionDenied(error.message(), path);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It feels weird that we have different definitions of "permission denied" here than on line 158. It seems like we should be able to have a throwFileExceptionIfFailed(std::error_code) function to handle this. If you do that, it could replace the if (!error) on line 237 as well, so you could just do throwIfFailed(error); return result;

Edit: I see that there are a ton of blocks like this. Unless they really need different checks, please pull this out to a function.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very good point. I've refactored these blocks out to a helper function.

@@ -1564,7 +1569,19 @@ FileDesc File::get_descriptor() const
bool File::get_unique_id(const std::string& path, File::UniqueID& uid)
{
#ifdef _WIN32 // Windows version
throw std::runtime_error("Not yet supported");
WindowsFileHandleHolder fileHandle(::CreateFile2(std::filesystem::path(path).c_str(), FILE_READ_ATTRIBUTES,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ugh, why would windows have you call a function named CreateFile if it don't want to actually create a file. Would you mind adding a comment like // Despite the name, this call won't actually create a file if it doesn't exist yet.

return path;

return (std::filesystem::path(base_dir) / path_).string();
#elif !defined(_WIN32)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't this always true now, so that the final #else is dead code?

Comment on lines 1627 to 1628
if (path_.is_absolute())
return path;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't this included in the behavior of path's operator<? If this is different and needed for some reason, a comment explaining why would be nice.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're absolutely right. According to https://github.com/microsoft/STL/blob/73924c1920af92899f7582cd904ea819b9db35bc/stl/inc/filesystem#L708-L720 that's indeed the case. Side note: I love that Microsoft's STL is open source now and I can get useful information about its workings like this.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI, I just looked it up at https://en.cppreference.com/w/cpp/filesystem/path/append (operator/ docs just say to see here for behavior)

Copy link
Contributor

@RedBeard0531 RedBeard0531 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, forgot to publish comments after doing the review!

@fealebenpae fealebenpae merged commit 18c99b7 into master Feb 15, 2023
@fealebenpae fealebenpae deleted the yg/win32-file-uniqueid branch February 15, 2023 13:02
@kiburtse kiburtse mentioned this pull request Mar 3, 2023
3 tasks
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Mar 21, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants