Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

multiple borrows extension #5

Merged
merged 10 commits into from
Jul 3, 2021
Merged

Conversation

noamtashma
Copy link
Contributor

This adds functionality for getting multiple borrows from different ghostcells at the same time.
I put this under an experimental flag, since it seems that this wasn't proven by the ghostcell paper. However, qcell
seems to already have this functionality for some time, and it definitely sounds sound to me.

In any case, this probably needs a macro, or some other way, to mutably borrow more than 3 elements at a time, and we can't write a method for each number.

@noamtashma
Copy link
Contributor Author

I tried to write a function that receives an &'a [&'a GhostCell<'brand, T>] and borrows all of them, but I haven't managed
to write it nicely. I've run across a few problems:

  • in regards to avoiding quadratic complexity, I can't think of a way to do it without allocation, without shuffling the order of the references.
  • I would like to rust cast or transmute the whole thing into a &'a mut [&'a mut T] and be done with it. However,
    even if we annotate GhostCell as repr(transparent), so the representation are indeed guaranteed to be equal, it's apparently considered unsafe to transmute GhostCell<T> into a T.
  • If instead of casting it we iterate over the references and convert them one by one, we can either return an owned collection (requiring allocation) or an iterator. If we try returning an iterator, the compiler complains about hiding the 'brand lifetime, because 'brand doesn't appear in impl Iterator<Item=&'a mut T> + 'a. Exposing the specific type is possible, but creates a huge and annoying overly-specific signature that principally shouldn't be exposed to the user.

@matthieu-m
Copy link
Owner

First, about your current code:

  1. Nice toggle.
  2. Sized should be unnecessary, but it's too painful on stable NOT to use it, so I agree it's easier for now; we can always lift the requirement later, without breaking any code.

I think there's 2 distinct usecases here:

  1. Tuples, just like the twice and thrice cases you showed here. Those could be generalized using a trait implementation.
  2. Arrays (not slice), leveraging const generics.

The latter is the simpler, I think:

// Provide `borrow_array` for completion's sake; relatively trivial, except for MaybeUninit use.

fn borrow_array_mut<'a, T, const N>(cells: [&'a GhostCell<'brand, T>; N], token: &'a mut GhostToken<'brand, T>)
    -> Option<[&'a mut T; N]>
{
    // 1. Build [*const (); N], Sort it, Check for duplicates.

    Some(unsafe { borrow_array_mut_unchecked(cells, token) })
}

unsafe fn borrow_array_mut_unchecked<T, const N>(cells: [&'a GhostCell<'brand, T>; N], token: &'a mut GhostToken<'brand, T>)
    -> [&'a mut T; N]
{
    //  Build MaybeUninit<[&mut T; N]>, Initialize piecemeal, Return.
}

The tuple case is a little more complicated. Essentially you'd want a (maybe?) sealed trait -- I know there's a trick, though I've never used it -- that is implemented for each tuple arity from 0 to 12, and then a function making use of that special trait:

fn borrow_all_mut<T: GhostTuple<'brand>>(tuple: &T, token: &mut GhostToken<'brand>)
     -> Option<T::OutputMut>
{
     t.borrow_mut(token);
}

And the trait would look something like (unless 2 traits would be better?):

impl<'a, 'brand, T0, T1> GhostTuple<'brand> for (&'a GhostCell<'brand, T0>, &'a GhostCell<'brand T1>) {
     type Output = (&'a T0, &'a T1);
     type OutputMut = (&'a mut T0, &'a mut T1);

     fn borrow(&self, token: &'a GhostToken<'brand>) -> Self::Output { (self.0.borrow(token), self.1.borrow(token)) }

     fn borrow_mut(&self, token: &'a mut GhostToken<'brand>) -> Option<Self::OutputMut> {
         // 1. Build local [*const(); 2], Sort it, Check for Duplicate.

          Some(unsafe { self.borrow_mut_unchecked(token) })
     }

     unsafe fn borrow_mut_unchecked(&self, token: &'a mut GhostToken<'brand>) -> Self::OutputMut {
         //  Just do it!
     }
}

Note: this could be implemented for all arities with a macro, or using build.rs for code generation if the macros are too hairy.

I note that the 2 duplicate checks both rely on local arrays, so there's probably some common code which can be extracted there.


I'd rather have the more generic solutions than specialized implementations for 2 and 3, if that makes sense?

Let me know if you'd like to work on either array or tuple support.

Copy link
Owner

@matthieu-m matthieu-m left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi,

I see that you are pushing changes that are not quite going in the direction I was thinking -- as illustrated in my previous comment.

I think it would be best to agree on the direction before you put too much work in this.

Cheers.

src/ghost_cell.rs Outdated Show resolved Hide resolved
src/ghost_cell.rs Outdated Show resolved Hide resolved
}
}
Some(())
// TODO: if the array is large enough, sort the values instead.
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good point, sorting is probably less efficient on small arrays.

Do you have a rough idea of where the tipping point would be?

src/ghost_cell.rs Outdated Show resolved Hide resolved
@noamtashma
Copy link
Contributor Author

Yeah, sorry. As you see this is commit is pretty incomplete. I didn't realize that pushing it to github will immediately add it to the pull request. I'll wait until I have more complete code before pushing it to github in the future.

I basically just worked first on making macros work, and left cleaning it up and making it do the right thing for later. I figured it's better to have separate commits than commit things all together.

Do you prefer to have an associated method or free function? Right now it's an associated method (GhostCell::borrow_mut_twice(...)). On one side, That makes the type a little weird, because the first type comes from the Self type (otherwise using the method results in an unspecified type error).

On the other side, having it as a free function might contaminate the namespace too much, and this function isn't a very important function.

In any case, having both is redundant. I just didn't want to delete the free-function version of the macro just so I didn't lose it.

I'll clean up and work on your feedback soon (hopefully).

@noamtashma
Copy link
Contributor Author

Now the code should be better.

Now that there's a trait for the relevant tuples, I think that having the borrow_mut method be a method of that trait is the right option, as opposed to having a GhostCell associated method or having a free function.

I'm not sure whether to call it just borrow_mut or something more explicit like multiple_borrow_mut. I changed it because when it is directly called on a tuple it's a bit too obvious it acts on multiple values. But I'm not sure. Also I'm generally unsure about the rest of the naming choices as well.

It might nice to implement the same trait for arrays as well, when I get to implementing it for arrays, because then the
interface for the arrays and the tuples will be uniform.

Also, I think I'll move all of the feature-gated code to a single module. It will save lots of #[cfg(...)] attributes.

What do you think?

@noamtashma
Copy link
Contributor Author

should I convert this into a draft pull request? It's still not final.

@matthieu-m
Copy link
Owner

Hi!

I am not very used to github flows, but if there's a draft function it may nice, as it clearly indicates the status of the PR.

Otherwise, as long as we both agree that the PR is a work in progress, it's fine for me. I just want to make sure to calibrate expectations so that you're not waiting for me to merge while I'm waiting for you to complete some work -- let's make sure to be explicit about where we stand.


Regarding your remarks, I think we're on the same wavelength:

  • I was also thinking that using the same trait for both tuples and arrays would be nice. I was thinking of having both GhostBorrow and GhostBorrowMut, as a parallel to core::borrow::{Borrow, BorrowMut}.
  • And once you have a trait, I am not sure about having a free function, seems redundant indeed.
  • I agree that having a separate module would make more sense. Following core precedent, I would call it ghost_borrow. Then the entire module can be cfg'ed in one fell swoop which avoids the noise.
  • One tiny remark: similarly I prefer wrapping all tests into a #[cfg(test)] mod tests { ... } module, nested into the module they test, for the very same reason of having a clear boundary between prod vs test code -- it's especially helpful when one wants to introduce helpers later on.

Regarding the traits, I was thinking that it would be nice to implement them for a variety of types, and notably a variety of values and references:

  1. Reference: &'a [GhostCell<'brand, T>] projected to &'a [T] and &'a mut [T].
  2. Reference: &'a [GhostCell<'brand, T>; N] projected to &'a [T; N] and &'a mut [T; N].
  3. Reference: &'a (GhostCell<'brand, Ts>...) projected to &'a (Ts...) and &'a mut (Ts...).
  4. Value: [&'a GhostCell<'brand, T>; N] projected to [&'a T; N] and [&'a mut T; N].
  5. Value: (&'a GhostCell<'brand, Ts>...) projected to (&'a Ts...) and (&'a mut Ts...).

The latter two, the "Values" 4 and 5, are the problem. You need self, not &self, and yet you need a &'a GhostToken<'brand> for them, which requires either:

  • Lifting 'a and 'brand to the trait level.
  • Or an associated type, like you've done.

I'll just list the two versions close to each others so they can be compared:

trait GhostBorrow<'a, 'brand> {
    type Target;

    fn borrow(self, token: &'a GhostToken<'brand>) -> Self::Target;
}

trait GhostBorrow { // or trait GhostBorrow<'brand> ?
    type Target;
    type Token;

    fn borrow(self, token: Self::Token) -> Self::Target;
}

And the mutable version:

trait GhostBorrowMut<'a, 'brand> {
    type Target;

    fn borrow_mut(self, token: &'a mut GhostToken<'brand>) -> Self::Target;

    unsafe fn borrow_mut_unchecked(self, token: &'a mut GhostToken<'brand>) -> Self::Target;
}

trait GhostBorrowMut { // or trait GhostBorrowMut<'brand> ?
    type Target;
    type Token;

    fn borrow_mut(self, token: Self::Token) -> Self::Target;

    unsafe fn borrow_mut_unchecked(self, token: Self::Token) -> Self::Target;
}

Among those, I think the first alternative, taking &'a [mut] GhostToken<'brand> explicitly would be the easiest to use. I fear that the associated type version would require user writing GhostBorrow<Token = &'a GhostToken<'brand>> in a number of situations, which is somewhat painful.

So I'd lean slightly towards having GhostBorrow<'a, 'brand> and GhostBorrowMut<'a, 'brand>, unless you can think of a problem with those?

@noamtashma
Copy link
Contributor Author

I actually haven't thought of having 'a, 'brand be part of the trait. That does free up the Token associated type, which is nice.
Also I just realized that the method name borrow_mut name collides with the standard library's BorrowMut trait method, so it probably has to be changed back to borrow_mut_multiple.

I have a few questions about the different types you suggest implementing multiple borrowing for.

First of all, all of the immutable borrows seem less useful to me, because that's already easily implementable. If you have some ghost cells, you can already immutably borrow them at the same time, without worrying about gathering them up into a tuple and without worrying about the possibility any of them were the same ghost cell.
However, I do see the benefit of including it just for consistency.

In addition, I'm not sure if options 1,2,3 are useful. It seems that in order to use them, the ghost cells have to be stored in the exact way that they will be borrowed. That is, if you use option 2, you have a [GhostCell<'brand, T>; N]. That means you can basically only mutably borrow your cells in the order that they're stored in.
You also probably could have just stored an GhostCell<'brand, [T; N]> to begin with. And the ghost-cell paper's author's version has a function that converts from GhostCell<'brand, [T]> into [GhostCell<'brand, T>] called as_slice_of_cells (That is implemented as an unsafe cast).

I guess the benefit of options 1,2,3 is that they should probably be implemented as just a cast, removing any overhead apart for the duplicate check. However, I'm unsure that that's safe rust.

Even if GhostCell was tagged as repr(transparent), It seems that that isn't enough to make transmuting from &GhostCell<'brand, T> into a &T safe, even though they are guaranteed to have the same layout. That is in spite of the supposedly proved-safe version of ghost-cell casting from GhostCell<'brand, [T]> into [GhostCell<'brand, T>], which should probably be considered unsafe in the same way.
Maybe I misunderstood the documentation about repr(transparent). I don't know.
Otherwise, I don't think it's possible to implement without a cast.

About option 4, I'm unsure how to implement it. It seems it can implemented as a cast as well. However, implementing it without a cast, I would like to have some map function that would keep track of the fact it still has N elements.
for example, something like the generic_array crate, that was used before const generics were introduced. However, I can't seem to find that function.

I saw you suggested using MaybeUninit. Is that the best way to do that?

@noamtashma
Copy link
Contributor Author

The cast issue seems to be even more complicated than I thought. https://users.rust-lang.org/t/transmuting-types-with-repr-transparent-parameters/26499

@noamtashma
Copy link
Contributor Author

It seems I misunderstood. The RFC (https://rust-lang.github.io/rfcs/1758-repr-transparent.html) says that
"Coercions and casting between the transparent wrapper and its non-zero-sized types are forbidden."
But I think now that casting and coercions are separate things from transmuting. So it seems transmuting is okay. But I'm still not sure.

@matthieu-m
Copy link
Owner

Also I just realized that the method name borrow_mut name collides with the standard library's BorrowMut trait method, so it probably has to be changed back to borrow_mut_multiple.

I'm not sure if it's a problem; I can't think of BorrowMut being implemented for the GhostCell, so I think there's no overlap between the two. And worst case, one can disambiguate with <Type as Trait>::borrow_mut(...).

Unless the clashes are prevalent, I'd favor keeping the shorter name, as that's going to be used a lot.

First of all, all of the immutable borrows seem less useful to me, because that's already easily implementable. If you have some ghost cells, you can already immutably borrow them at the same time, without worrying about gathering them up into a tuple and without worrying about the possibility any of them were the same ghost cell. However, I do see the benefit of including it just for consistency.

I fully agree they are safer to implement, and in the fact the lack of unchecked variant clearly demonstrates it.

I thought of implementing them for two reasons:

  • Some may require unsafe code.
  • Having the implementations enable generic code (over array sizes, or tuple arities).

For me, those are good reasons to do it... but this doesn't mean YOU have to. If you prefer to focus on the mutable case, that's perfectly fine with me.

In addition, I'm not sure if options 1,2,3 are useful. It seems that in order to use them, the ghost cells have to be stored in the exact way that they will be borrowed. That is, if you use option 2, you have a [GhostCell<'brand, T>; N]. That means you can basically only mutably borrow your cells in the order that they're stored in.
You also probably could have just stored an GhostCell<'brand, [T; N]> to begin with. And the ghost-cell paper's author's version has a function that converts from GhostCell<'brand, [T]> into [GhostCell<'brand, T>] called as_slice_of_cells (That is implemented as an unsafe cast).

Indeed. It's not exactly "earth-shattering". It's easy to implement, though, no duplicate check needed, for example, and once you have the traits, you might as well.

Once again, just because I think they'd be nice to provide doesn't mean YOU have to provide them. I am thinking allowed here.

I guess the benefit of options 1,2,3 is that they should probably be implemented as just a cast, removing any overhead apart for the duplicate check. However, I'm unsure that that's safe rust.
[...]

That's a good question, I'm not exactly sure. It's certainly the entire purpose of repr(transparent) but what is or isn't safe... From the RFC you linked, I'd guess transmute is the best option.

About option 4, I'm unsure how to implement it. It seems it can implemented as a cast as well. However, implementing it without a cast, I would like to have some map function that would keep track of the fact it still has N elements. For example, something like the generic_array crate, that was used before const generics were introduced. However, I can't seem to find that function.

I saw you suggested using MaybeUninit. Is that the best way to do that?

I think the only two options would be transmute or MaybeUninit. transmute is less code, but relies on the repr(transparent), MaybeUnit requires writing more code in exchange for not relying on repr(transparent).

I am not sure which is best, per se. I think we can rely on repr(transparent) being good to its word?

One possibility would be to test with MIRI and see if the Stacked Borrow model complains with either implementation. In the end we want an implementation that works under Stacked Borrow rules -- even if they are not perfect -- because that's what is currently available to check things.

@noamtashma
Copy link
Contributor Author

Hi!

I implemented all mutable cases (using transmutes) except case 4. I also added [repr(transparent)], moved all of the feature-gated stuff into a module, and changed the documentation and the trait a bit. All of them are untested though.

I didn't manage to implement case 4. If implemented using transmutes, it complains that the sizes of [A; N] and [B; N] can't be known at compile time. This is apparently a known issue: see here.

If implemented using a Uninit array, we basically get into the same issue, that there's no easy stable interface for assuming that a generic array is initialized (mentioned in the same link; also see tracking issue for unstable solution. There is a workaround, but it's more complicated and uses more unsafe primitives.

I left the attempted implementation of case 4 (using MaybeUninit like you suggested) in comments in the code so that you could try it.

I also ran into an issue of not being able to call the same check_distinct function. However, that's not as bad, as in the worst case that'll just cause code duplication.

Also, a random thought: do you think the function should always return Option<_>, even if it is known at compile time that the cells don't need to be checked? Should one element tuples return Option<_>? It's a small trade-off between consistency and usefulness.
Also, You can consider implementing the trait as well for plain GhostCell.

I guess I will indeed focus on the mutable case only.
It's quite fun working on this small library, I have to say. Thanks again for everything.

@matthieu-m
Copy link
Owner

matthieu-m commented Jun 30, 2021

Hi,

Sorry, I've been a bit rushed so I haven't had time to review the code; I'll get to that later this week hopefully.

I implemented all mutable cases (using transmutes) except case 4. [...]

It's a bit sad that case 4 cannot be implemented, but having 4 out of 5 is better than none; we can always add 4 later.

I also ran into an issue of not being able to call the same check_distinct function. However, that's not as bad, as in the worst case that'll just cause code duplication.

Hopefully fixable; I'll have a look when I review. Otherwise not as bad indeed.

Also, a random thought: do you think the function should always return Option<_>, even if it is known at compile time that the cells don't need to be checked?

That's an interesting question.

I think it's better for generic functions to have a generic interface in general; after all, the generic functions are only called in generic context, so a divergent interface in those cases is just created more troubles for the caller.

In this specific case, however, this raises the possibility of using Result<_, _> instead, and an error type that is an associated type of the trait implementation.

The interface remains generic for those who want to handle the generic case, complete with an error type to use, however the specific infallible implementations can use a Never Type (aka !, though if unstable an uninhabited enum works just well). rustc recognizes Never Types, know they cannot be instantiated, and therefore can elide the error branch in the code.

Furthermore, users can check for the presence of infallible borrow by constraining the error type: GhostBorrowMut<Error = !>.

As a result, I think this is the best of both worlds:

  • Generic and uniform.
  • Yet, allowing for specialization of code for infallible cases, both by user and compiler.

@matthieu-m
Copy link
Owner

Hi,

I review the code and it all looks good to me.

Is there anything more you'd like to do? If not I'll go ahead and merge it, then iterate on it to non-mutable / mutable-unchecked methods, etc...

@noamtashma
Copy link
Contributor Author

No, I think that's pretty much it.

:)

@matthieu-m matthieu-m merged commit 1c35a4b into matthieu-m:master Jul 3, 2021
@matthieu-m
Copy link
Owner

Thank you very much for your contribution.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants