Parallel relabel #1137
Conversation
src/libpriv/rpmostree-core.c (Outdated)

@@ -2586,6 +2447,7 @@ relabel_in_thread_impl (RpmOstreeContext *self,
                                     NULL, NULL, NULL);

  ostree_repo_commit_modifier_set_devino_cache (modifier, cache);
  ostree_repo_commit_modifier_set_sepolicy (modifier, self->sepolicy);
To answer your question, I can't think of a good reason right now why it wasn't done this way. Definitely ostreedev/ostree#1165 would've been a blocker, though clearly it could've been fixed back then too. Anyway, this is definitely better!
Ouch, I know that pain! :(
For Python at least, it's really easy to do multiprocessing using the `multiprocessing` module. I make use of it in the PAPR Python rewrite (which I really should pick up again). But yeah, it clearly sucks if you need a lot of coordination.
☔ The latest upstream changes (presumably 51c5591) made this pull request unmergeable. Please resolve the merge conflicts.
(force-pushed from 7c95875 to 2586e39)
Rebased 🏄‍♂️ and lifting WIP.
                      GCancellable     *cancellable,
                      GError          **error)
{
  if (g_cancellable_set_error_if_cancelled (cancellable, error))
Don't we also need this in import_in_thread? Otherwise, we won't even stop new imports, no?
We check the cancellable on each iteration of reading the archive, and GTask also checks the cancellable before actually running the thread (see g_task_start_task_thread()), so it's probably simplest to just drop the one in importing.
Hmm, no: the GTask check only applies if one uses g_task_set_return_on_cancel(), so let's be conservative.
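To make the conclusion above concrete, here is a minimal sketch of the pattern being discussed (the function body and names are illustrative assumptions, not the actual rpm-ostree code): keep an explicit cancellation check at the top of the thread function, since GTask only short-circuits the thread func on cancellation when g_task_set_return_on_cancel() is in use.

```c
/* Illustrative sketch only: explicit cancellation check at thread entry.
 * GTask does not skip the thread function for us unless
 * g_task_set_return_on_cancel() has been enabled on the task. */
static gboolean
import_in_thread (RpmOstreeContext *self,
                  GCancellable     *cancellable,
                  GError          **error)
{
  /* Bail out immediately if the caller already cancelled the operation */
  if (g_cancellable_set_error_if_cancelled (cancellable, error))
    return FALSE;

  /* ... per-package import work, re-checking the cancellable as we go ... */

  return TRUE;
}
```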
(force-pushed from 2586e39 to a060f4e)
src/libpriv/rpmostree-core.c (Outdated)

@@ -2093,6 +2095,7 @@ rpmostree_context_import_jigdo (RpmOstreeContext *self,

  self->async_dnfstate = hifstate;
  self->async_running = TRUE;
  self->async_cancellable = g_cancellable_new ();
Sorry, I'm still not clear on how this works exactly. We create a new GCancellable here, but when do we actually consult it? We're still passing our own cancellable to rpmostree_importer_run_async, so the g_cancellable_set_error_if_cancelled importer fixup you added is not actually using this, right? So, aren't we still going to end up trying to import all the pkgs even if the very first one failed? Or should we be passing self->async_cancellable to rpmostree_importer_run_async down below?
No, you're totally right, it doesn't work 😉. I was throwing code at the wall here a bit. Let's do the obvious thing and reuse the source cancellable, which is what ostree-pull does too.
Pushed a fixup ⬇️ which I tested by adding:

  if (g_random_boolean ())
    return glnx_throw (...);

in the relabel_in_thread() code.
Hmm, it looks like we're not preserving the original error correctly:
# + ssh -o User=root -o ControlMaster=auto -o ControlPath=/var/tmp/ssh-vmcheck2-1513030065068962438.sock -o ControlPersist=yes vmcheck2 env ASAN_OPTIONS=detect_leaks=false rpm-ostree install test-opt-1.0
# error: Operation was cancelled
+ fatal 'File '\''err.txt'\'' doesn'\''t match regexp '\''See https://github.com/projectatomic/rpm-ostree/issues/233'\'''
+ echo File ''\''err.txt'\''' 'doesn'\''t' match regexp ''\''See' 'https://github.com/projectatomic/rpm-ostree/issues/233'\'''
File 'err.txt' doesn't match regexp 'See https://github.com/projectatomic/rpm-ostree/issues/233'
src/libpriv/rpmostree-core.c (Outdated)

@@ -2095,7 +2096,7 @@ rpmostree_context_import_jigdo (RpmOstreeContext *self,

   self->async_dnfstate = hifstate;
   self->async_running = TRUE;
-  self->async_cancellable = g_cancellable_new ();
+  self->async_cancellable = cancellable;
Ahh OK, that makes more sense now! :)
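For readers following along, here is a rough sketch of the pattern settled on above; the callback and field names are assumptions rather than the actual rpm-ostree code. All async imports share the caller's cancellable, and the first failure cancels it so the remaining imports bail out early.

```c
/* Rough sketch (names are assumptions): the completion callback records
 * the first error and cancels the shared cancellable so that in-flight
 * and queued imports stop early. */
static void
on_one_import_done (GObject *src, GAsyncResult *res, gpointer user_data)
{
  RpmOstreeContext *self = user_data;
  g_autoptr(GError) local_error = NULL;

  if (!finish_one_import (src, res, &local_error))
    {
      if (self->async_error == NULL)
        self->async_error = g_steal_pointer (&local_error);  /* keep the first error */
      g_cancellable_cancel (self->async_cancellable);         /* stop the rest */
    }
}
```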
To exit earlier if we've been cancelled. Came up in review for parallel relabeling.
I believe this is a leftover vestige, and it was adding confusion when I was debugging `rpmostree-core.c` async ops and cancellation. Now the only cancellables in the daemon are created by transaction ops.
Basically since we're doing internal async ops which set the cancellable on failure, we still want the first error to win since it'll be more useful. See the docs for `g_task_set_check_cancellable()` for more.
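As a rough illustration of what that GTask knob does (a sketch, not the exact rpm-ostree call site, and on_done is a hypothetical callback): disabling the automatic cancellation check makes g_task_propagate_*() return the error the task actually set, instead of replacing it with G_IO_ERROR_CANCELLED when the cancellable was triggered afterwards.

```c
/* Sketch: let the first real error win rather than G_IO_ERROR_CANCELLED.
 * With check_cancellable = FALSE, g_task_propagate_boolean() reports the
 * error the thread func returned, even if the cancellable was cancelled
 * later as a side effect of that failure. */
GTask *task = g_task_new (self, cancellable, on_done, user_data);
g_task_set_check_cancellable (task, FALSE);
g_task_run_in_thread (task, relabel_in_thread);
g_object_unref (task);
```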
This is another big task just like importing that greatly benefits from being parallel. While here I hit the issue that on error we didn't wait for pending async tasks to complete; I changed things for importing so that we do that, and used it here too. This was almost straightforward except I spent a *lot* of time debugging what turned out to be calling `dnf_package_get_nevra()` in the worker threads 😢. I'm mostly writing this to speed up unified core/jigdo, but it's also obviously relevant on the client side.
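Purely as an illustration of the kind of workaround described above (the variable names and shape are assumptions, not the actual patch): the NEVRA strings can be computed once on the main thread and handed to the workers, so dnf_package_get_nevra() is never called from a worker thread.

```c
/* Illustrative sketch (not the actual patch): precompute NEVRAs on the
 * main thread so worker threads never call dnf_package_get_nevra(). */
g_autoptr(GPtrArray) nevras = g_ptr_array_new_with_free_func (g_free);
for (guint i = 0; i < packages->len; i++)
  {
    DnfPackage *pkg = packages->pdata[i];
    g_ptr_array_add (nevras, g_strdup (dnf_package_get_nevra (pkg)));
  }
/* ...then hand the precomputed strings to the relabel worker threads... */
```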
(force-pushed from 3f9595a to f7e9992)
Yep, added a prep commit to fix that ⬆️
⚡ Test exempted: merge already tested.
Basically the `rpmostree_context_relabel()` call we had in the treecompose path for unified core didn't actually have any effect, since the core code did a relabel and unset the array. I think this may actually be a regression from #1137, though I didn't verify. Anyway, looking at this, the code is a lot simpler if we change the API so that the "normal" relabeling is folded into `rpmostree_context_assemble()`. Then we change the public relabel API to be "force relabel", which we use in the unified core 🌐 treecompose path. This shrinks the jigdoRPM for FAH from 90MB to 68MB. Closes: #1172 Closes: #1173 Approved by: jlebon
On top of #1124