Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rados: Make "operation" code comply to pointer passing rules #597

Merged
merged 2 commits into from
Nov 15, 2021

Conversation

ansiwen
Copy link
Collaborator

@ansiwen ansiwen commented Oct 11, 2021

In #581 we discovered, that we are actually violating the pointer passing rules in some places. For the write buffer of the writeSteps this change uses a PtrGuard, while the return parameters of GetOmapSteps are allocated in C memory.

Related: #584

@ansiwen ansiwen added the no-API This PR does not include any changes to the public API of a go-ceph package label Oct 11, 2021
@ansiwen ansiwen force-pushed the pr/ansiwen/fix-pointer-passing branch from 1973cc7 to 141421d Compare October 11, 2021 14:39
Copy link
Collaborator

@phlogistonjohn phlogistonjohn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My only real specific request is to choose a name that's clear for the current "Poke."

I also wonder if we're going to make a direct call to pointer guard from library code if we should have a bit of "soak time" by merging it after we release rather than immediately before release.

Most everything else seems ok to me.

Comment on lines 50 to 51
// Poke stores the pinned Go pointer in C memory at cPtr
func (v *PtrGuard) Poke(cPtr CPtr) *PtrGuard {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// Poke stores the pinned Go pointer in C memory at cPtr
func (v *PtrGuard) Poke(cPtr CPtr) *PtrGuard {
// Store the pinned Go pointer into the C memory located at cPtr.
func (v *PtrGuard) Store(cPtr CPtr) *PtrGuard {

I think the above naming would be clearer. I'm not sure if Poke is meant to be a reference to Peek/Poke - but even that seems kind of obscure to me.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that was the idea (Peek/Poke). But Store is also fine for me.

return &writeStep{
b: b,
cBuffer: (*C.char)(unsafe.Pointer(&b[0])),
pg: cutil.NewPtrGuard(bufPtr),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this our first hard dependency on ptrgourd? I think that all previous cases were wrapped so that they either copied data or used ptrguard, right?

If so, I'm ok with it in theory but I'm just a tad hesitant to merge it right before we release. Thoughts?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes you are right, I forgot about our ptrguard tag. I mean, we test it, but it's true, that it's not used in the "field" (that is, CSI) yet. Let's not merge then for now.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just noticed this comment. Does that mean PRs that would depend on this fix (e.g. #581) are blocked until after December's release?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, the PRs can be merged prior to release. We simply need to discuss our options, some of the work can possibly be split up, for example. In fact I'd prefer to merge as much as possible sooner rather than later Unfortunately, things have been slow due to illnesses and time off. Please continue to ping us on these items so that we don't forget to work on them!

@ansiwen ansiwen force-pushed the pr/ansiwen/fix-pointer-passing branch from 141421d to 4162b0a Compare October 11, 2021 15:27
@phlogistonjohn
Copy link
Collaborator

Following up on this: do you prefer to:
a) take this as-is - adding one new hard dependency on prtguard
b) add some wrapping around this change to make prtguard a build-time toggle, like it's other uses
c) enable prtguard generally, including a new hard dependency on prtguard
d) enable prtguard generally, including a build-time toggle for the new usage

Note: I don't think these all need to be done in this pr, just looking for a general idea where you'd like to go with this. I have a preference but I'm not going to tell you (yet). :-)

@ansiwen
Copy link
Collaborator Author

ansiwen commented Oct 19, 2021

@phlogistonjohn Ok, how about this: we switch to ptrguard by default and wait one release cycle, if someone complains. This PR I will split up so that without ptrguard tag the old ("insecure") code is used. I'm not in the mood to build another "safe" shim layer that we will remove after one release again. ;-)

@phlogistonjohn
Copy link
Collaborator

That sounds OK to me, but I'm a bit unclear on "wait one release cycle, if someone complains". If we enable prtguard by default (soonish) we'll have the remaining october/november period of pre-release "soak time" before the next release in Dec.
Then unless we find an issue it would be enabled by default in the released version. At that point it'd be the default in all released versions unless we get reports [1]. But I don't quite get the "one release cycle" aspect from your comment.

[1] - what we do if we get reports is contingent on the severity of the report and how many releases it's been enabled IMO

@ansiwen
Copy link
Collaborator Author

ansiwen commented Nov 9, 2021

Sorry, I wasn't clear enough. What I meant was: for the next release we will switch to ptrguard by default, but keep the other code. If until the following release no one complains, we can remove the non-ptrguard code.

@ansiwen
Copy link
Collaborator Author

ansiwen commented Nov 9, 2021

Actually, now that I know a lot more about that whole topic, and given that the pinning API has been accepted, I would address the whole thing differently now: we can assume, that the pinning API will land before the GC becomes "moving". So before the API is there, ptrguard doesn't have to do anything. And as soon as the API is there, it can use it. The only use-case for the current implementation would be, that older versions of go-ceph would still work with some future version of go that might have a moving GC.

@phlogistonjohn
Copy link
Collaborator

Actually, now that I know a lot more about that whole topic, and given that the pinning API has been accepted, I would address the whole thing differently now: we can assume, that the pinning API will land before the GC becomes "moving". So before the API is there, ptrguard doesn't have to do anything. And as soon as the API is there, it can use it. The only use-case for the current implementation would be, that older versions of go-ceph would still work with some future version of go that might have a moving GC.

Sure, I agree that it wouldn't make a lot of sense to worry about moving gc a lot given the above. To make things more "actionable" what should we do over the next couple of weeks? Break this PR up into parts, one for the ptrguard stuff and one for the other changes? I also want to unblock PR #581.

@ansiwen ansiwen force-pushed the pr/ansiwen/fix-pointer-passing branch from 4162b0a to ab60ced Compare November 15, 2021 15:29
@ansiwen
Copy link
Collaborator Author

ansiwen commented Nov 15, 2021

I removed the PtrGuard related changes and just added a TODO comment for later reference.

Copy link
Collaborator

@phlogistonjohn phlogistonjohn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM Thanks.

For a little extra context: we discussed this PR earlier today and decided that it would not be the place and time to create a hard dependency on ptrguard given that Go's GC is not a moving GC today.

@mergify mergify bot merged commit b5c4989 into master Nov 15, 2021
@mergify mergify bot deleted the pr/ansiwen/fix-pointer-passing branch November 15, 2021 21:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
no-API This PR does not include any changes to the public API of a go-ceph package
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants