-
Notifications
You must be signed in to change notification settings - Fork 363
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add 'disable_lockfile' configuration to disable locking globally #1830
Conversation
Hi @danpf thanks for the PR! The lockfile is needed to store meta information about the running process in a folder on the computer, yeah. How about we make a "disable_filelock: true/false" option in |
Such a global flag/option "disable_filelock: true/false" in |
@danpf this looks almost perfect. |
@wolfv oh hey ya sorry you're too quick for me I was going to ask if that was the best way to handle this or if we should be special casing in places where locks are created. Juggling a few things but I'll do that soon. |
Assuming tests pass this should be good. I left the micromamba run .json "lockfiles" (ScopedProcFile ctor) in as they are just created with ofstream and I figure that is pretty unlikely to fail. but if you think they should be removed as well I can do that too (just 1 line change) (see run.cpp) |
@wolfv Let me know if you think this PR needs anything else! |
libmamba/src/api/configuration.cpp
Outdated
.long_description(unindent(R"( | ||
Don't create a lockfile when running mamba commands. Use with caution! | ||
-- Some filesystems don't support File Locking and locks don't always | ||
make sense - like when on an HPC. Default is False (use a lockfile)"))); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add an example of what problems may arise when not using locking? What does Mamba use locks for that will now be the responsibility of the user?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, what commands are affected by this? If it's only a small selection of commands then it might be worthwhile explicitly listing them here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Depending on how much you want to write on this topic you can also extend the new troubleshooting page (#1873) and link to that
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 I will update this once we decide on the name to use
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We use lockfiles mainly to lock the package & repodata cache (bad things happen if two processes write to the same file...
We also use it in micromamba run
to write some json files to store data about the process in ~/.mamba/proc/...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We also use it in
micromamba run
to write some json files to store data about the process in~/.mamba/proc/...
Short question, I have noticed that I can't have more than 1 micromamba run ...
invocation running at the same time, spawning multiple ones fail with critical libmamba Another process with name '...' is currently running.
- Similarly if a micromamba run ...
process get's SIGKILLed, any micromamba run ...
afterwards fails with the same error until the zombie ~/.mamba/proc/*.json
get's cleaned up.
Will this PR fix these issues or should I open an issue?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@vicecea I think this might be confusing some issues. Running two run
commands with the same label
is disallowed. I don't think this PR will fix that. If you can open a new issue and describe exactly what is going on, that would be nice!
libmamba/src/core/util.cpp
Outdated
@@ -1133,7 +1137,7 @@ namespace mamba | |||
try | |||
{ | |||
// Don't even log if the file/directory isn't writable by someone or doesnt exists. | |||
if (path::is_writable(path)) | |||
if (Context::instance().disable_lockfile || path::is_writable(path)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Either I lack context or this is wrong, shouldn't it be !disable_lockfile && is_writable
? If so, this lacks a test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To revert this change you would need to modify this function:
https://github.com/danpf/mamba/blob/danpf-nolock/micromamba/src/run.cpp#L105 to not fail if nullptr is returned.
I also think it should not matter if path::is_writable
or not if we are not making a lockfile anyway so I think this is correct.
EDIT:
you will also have to modify the assert in ScopedProcFile
If you would like me to modify those functions and remove this change, let me know - it is your choice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the logic is now confusing (although I think I get it). Maybe we can return earlier when the lockfile is disable? Or do we even need that check here?
@@ -197,6 +197,7 @@ namespace mamba | |||
bool no_env = false; | |||
|
|||
std::size_t lock_timeout = 0; | |||
bool disable_lockfile = false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO we should negative this boolean (and option) like here
bool compile_pyc = true; |
mamba/micromamba/src/common_options.cpp
Lines 341 to 343 in e0527eb
auto& compile_pyc = config.at("compile_pyc"); | |
subcom->add_flag( | |
"--pyc,!--no-pyc", compile_pyc.get_cli_config<bool>(), compile_pyc.description()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@wolfv WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If that is what you want I will change it. It's your choice.
micromamba/src/run.cpp
Outdated
@@ -161,6 +161,8 @@ namespace mamba | |||
std::unique_ptr<LockFile> proc_dir_lock = lock_proc_dir()) | |||
: location{ proc_dir() / fmt::format("{}.json", getpid()) } | |||
{ | |||
if (Context::instance().disable_lockfile) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the code modified here does not do what you think. IIUC this creates a "pid file" that is used to give a list of running processes (see ~/.mamba/proc/
and micromamba ps
)
If CI is green with your changed then this also looks like missing test coverage :(
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree and I originally did not have this line, however on my older hpc I was still getting the error:
mamba run' failed to open/create file...
I cannot explain this, but when I disable the json pid fifles from being created - everything works great.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@danpf is the problem here with the lockfile or generally with writing to the home directory?
We do in fact lock the PID directory while the process is writing out the json file, so it's possible that it's also a lockfile issue.
Would it make more sense to put these checks in the lock_proc_dir()
function?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have looked a bit and I think this is somewhat unrelated to the lockfile problem. Probably the user cannot create a new directory / new files under the ~/.mamba/proc
directory. Could that be the case? We should handle that more gracefully, but I think it should go in a seperate PR.
Co-authored-by: Jonas Haag <[email protected]>
Co-authored-by: Jonas Haag <[email protected]>
Hi @danpf sorry for taking so long to look at this PR. I'd be super happy to merge this. We also have some additional fixes coming for lockfiles (#1193) (there were some logic issues with the assumption that PIDs are globally unique). With those two PRs in, the next mamba release should be much better in terms of file-locks! :) |
libmamba/src/core/util.cpp
Outdated
@@ -751,6 +751,9 @@ namespace mamba | |||
, m_timeout(timeout) | |||
, m_locked(false) | |||
{ | |||
if (!Context::instance().use_lockfiles) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This mixes expected failures (because of use_lockfiles
) with unexpected ones (exceptions).I think this check should be removed and use before trying to instantiate a LockFile.
return ptr; | ||
} | ||
auto ptr = std::make_unique<LockFile>(path); | ||
return ptr; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function mixes again unexpected failure with expected ones, requiring a duplicated check of use_lockfiles
before calling it to disambiguate between eexpected failures and unexpected ones.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A way to fix this is to make the constructor private, and have static functions instantiating lockfiles, and checking for the context variable. If the lock fails for any other reason, they should propagate the expection. Or use cpp_expected, and provide the reason for the failure (this way we can mix expected and unexpected, but we don't need duplicated checks).
micromamba/src/run.cpp
Outdated
@@ -104,6 +104,9 @@ namespace mamba | |||
|
|||
std::unique_ptr<LockFile> lock_proc_dir() | |||
{ | |||
if (!Context::instance().use_lockfiles) | |||
return nullptr; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This check should not be required. The problem is that try_lock
returns a nullptr if the lock fails, whatever the failure is expected or not.
889f2df
to
bbe0b26
Compare
Finally got a chance to check this out and I see that it's already done! Just wanted to say thank you to everyone involved for finishing this PR! |
…ly (mamba-org#1830) Co-authored-by: Danny <[email protected]> Co-authored-by: Jonas Haag <[email protected]> Co-authored-by: Wolf Vollprecht <[email protected]> Co-authored-by: Johan Mabille <[email protected]>
The point of this PR is to add a flag to
micromamba run
that will disable the lockfile creation before running the command.Reasoning
Dealing with unique filesystem problems is unfortunately all too common in my life and I have recently encountered issues with my parallel commands not running/erroring out due to my server's inability to create many lockfiles at a time. It appears that I am not the only one that has to deal with this #1446.
Looking at the code it seems to be a relatively simple change and is easily tested, but let me know if there's something I should change. I suppose an alternative to this would be #1448 but I think this still has its utility in cases when you know a lockfile is not necessary (like on HPCs.)
Also -- a genuine question -- I understand the reasoning behind using a lockfile when doing install/remove/etc, but I don't understand why a lock is necessary when using run (I assume that when using run the most common use-case is to run a binary in a separate environment) -- Was the command added to help with
micromamba ps
or were there other scenarios that were envisioned when locking was added torun
?Cheers,
~Danny