Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Windows support #640

Merged
merged 42 commits into from
Oct 24, 2023
Merged

Windows support #640

merged 42 commits into from
Oct 24, 2023

Conversation

nkaretnikov
Copy link
Contributor

@nkaretnikov nkaretnikov commented Oct 22, 2023

See #507. Original PR: #559.

The main.js file from conda-store-ui contains an emoji character that cannot
be encoded correctly unless the encoding is set to utf-8, which is not always
the default on Windows.
Windows uses a different type of permissions system than Unix. For now, using
None will cause action_set_conda_prefix_permissions to do nothing.
1000 hits the sqlite3 max expression depth and results in an error.
This disables --beat on Windows and sets FORKED_BY_MULTIPROCESSING=1, which is
required to make the default celery pool work on Windows.
os.path.join is incorrect for constructing a URL on Windows because URLs
should always use forward slashes regardless of platform.
The cookie length was incorrectly using .seconds instead of .total_seconds().
This was only a problem on Windows because Windows has a ~16 ms precision for
timedeltas, meaning the .seconds attribute for the difference was sometimes
being set to 0.
This is needed for uses with --json, because otherwise any stderr breaks the
json.
@nkaretnikov
Copy link
Contributor Author

nkaretnikov commented Oct 22, 2023

Note: Windows test takes 25 min. macOS: 14 min. Ubuntu: 10 min.

I wonder why that is. E.g., the post set up env on Windows takes 3 mins, while it takes 4 secs on macOS. Is this disk/fs perf? Other phases are also slower by 2 mins on average.

test_file2.write_text("b"*1000)
# Test hard links
test_file_hardlink = test_dir / "test_file_hardlink"
test_file_hardlink.hardlink_to(test_file)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note for reviewers: this is tested where test_dir size is tested as a whole.

encoding="utf-8",
)
self.stdout.write(result.stdout)
if not redirect_stderr:
self.stderr.write(result.stderr)
return result
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See #559 (comment) for context on redirects.

max_age=(authentication_token.exp - datetime.datetime.utcnow()).seconds,
max_age=int(
(authentication_token.exp - datetime.datetime.utcnow()).total_seconds()
),
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A fun bug 😄 glad we got to the bottom of this one.

@@ -73,6 +73,10 @@ conda-unpack

### Docker Registry

```{note}
Copy link
Contributor Author

@nkaretnikov nkaretnikov Oct 22, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if {note} syntax is supported. It seems to be ignored if not, so seems harmless.

@nkaretnikov
Copy link
Contributor Author

Manually tested this env via the admin UI:

channels:
- conda-forge
dependencies:
- python
- pip:
  - nothing
- ipykernel
- pytest
- requests
description: ''
name: test-env
prefix: null
variables: null

Building, rebuilding, and deletion work.

@nkaretnikov nkaretnikov marked this pull request as ready for review October 22, 2023 22:25
@nkaretnikov nkaretnikov requested a review from costrouc October 22, 2023 22:25
@nkaretnikov
Copy link
Contributor Author

@costrouc This is Aaron's Windows PR with some minor changes from me. PTAL and merge if OK.

@jaimergp
Copy link
Member

the post set up env on Windows takes 3 mins

You can do run-post: false (or similar name) on setup-miniconda. It's an ephemeral VM after all.

Copy link
Member

@costrouc costrouc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall these changes do not look controversial and I'm surprised that the windows fixes were actually smaller than I expected. I was really expecting issues with pathlib etc but it looks like most of those bugs didn't exist.

@@ -513,10 +513,11 @@ def update_packages(self, db, subdirs=None):
package_builds[package_key].append(new_package_build_dict)
logger.info("CondaPackageBuild objects created")

batch_size = 1000
# sqlite3 has a max expression depth of 1000
batch_size = 990
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I bet there is an interesting story here? Is this only something that showed up on Windows?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't seen this personally, but it's mentioned in one of the Aaron's commits:

Lower the batch size for updating packages from 1000 to 990
1000 hits the sqlite3 max expression depth and results in an error.

Seems harmless, so I decided to keep it.

@@ -302,21 +302,24 @@ def _default_celery_results_backend(self):
)

default_uid = Integer(
os.getuid(),
None if sys.platform == "win32" else os.getuid(),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see sys.platform == "win32" in many places. Should we have a constant that we create for this? Since much of our code only has a special path for Windows?

Copy link
Member

@costrouc costrouc Oct 24, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

More I think about it I don't think this is necessary to have a constant

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There was a constant in Aaron's original code. Well, two constants, in different files. Since you need to share this between multiple files, you'd have to put it in __init__ to avoid import issues. But we don't do this for any other platforms, so I decided to simply use sys.platform, as everywhere else, to avoid introducing new entities.

openapi_url=os.path.join(self.url_prefix, "openapi.json"),
docs_url=os.path.join(self.url_prefix, "docs"),
redoc_url=os.path.join(self.url_prefix, "redoc"),
openapi_url=posixpath.join(self.url_prefix, "openapi.json"),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll admit using os.path... for urls is a hack and I should have used something like yarl for example. Thoughts? Since url paths are not the same as file paths? Maybe posixpath is the right thing to use here. It certainly seems better than os.path.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A separate entity for URLs would be better, yes. But I'm not going to block this PR just because of this issue, since everything seems to work fine.

max_age=(authentication_token.exp - datetime.datetime.utcnow()).seconds,
max_age=int(
(authentication_token.exp - datetime.datetime.utcnow()).total_seconds()
),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A fun bug 😄 glad we got to the bottom of this one.

cmd = ["du", "-sb", str(path)]
else:
return str(du(path))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good to see the fallback is this approach 👍 If they are using cygwin or https://learn.microsoft.com/en-us/sysinternals/downloads/du it may still be possible they have du.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We've already looked at du from sysinternals. Besides being a separate tool, it also produces slightly different results. And the output format is also different. As for cygwin, the context for this work is to be able to use native Windows tooling.

output = subprocess.check_output(cmd, encoding="utf-8").split()[0]
if sys.platform == "darwin":
# mac du does not have the -b option to return bytes
output = str(int(output) * 512)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why 512? This is a weird size since it is a half a kb.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A comment from a discussion on the original PR: "Note that the numbers will be off a little bit (by less than 512) on Mac because du rounds up to the nearest multiple of 512 when converting bytes to blocks."

I'm still not certain that macOS code is correct here, but it does pass this new test (where we check all three platforms), so I'm going to keep this and will take a look later.

]

# The default Celery pool requires this on Windows. See
# https://stackoverflow.com/questions/37255548/how-to-run-celery-on-windows
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great glad this fix was minimal.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's still not clear whether it means that Celery works as intended, since it's not officially supported on Windows. But it does run, so we'll just use this for now.

val = disk_usage(test_dir)
assert isinstance(val, str)
assert 2000 + dir_size <= int(val) <= 2700 + dir_size
assert 2000 + dir_size <= du(test_dir) <= 2700 + dir_size
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The exact values depend on the block size and other details

Oh yeah this is a fun detail 😄 guess you can't be smaller than the blocksize and also can only take up an integer number of block sizes.

@nkaretnikov nkaretnikov merged commit 9018694 into conda-incubator:main Oct 24, 2023
18 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: Done 💪🏾
Development

Successfully merging this pull request may close these issues.

4 participants