Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added Optimized Spl-token program #31

Merged
merged 22 commits into from
Nov 5, 2024
Merged

Added Optimized Spl-token program #31

merged 22 commits into from
Nov 5, 2024

Conversation

L0STE
Copy link
Contributor

@L0STE L0STE commented Oct 30, 2024

Added most of the instruction from spl-token program using unsafe rust for optimization w/ @deanmlittle

Some example on how the instruction are constructed:

  • For data (trasfer_checked in this occasion):
let mut instruction_data = MaybeUninit::<[u8; 10]>::uninit();

// Populate data
unsafe {
    let ptr = instruction_data.as_mut_ptr() as *mut u8;
    // Set discriminator as u8 at offset [0]
    *ptr = 12;
    // Set amount as u64 at offset [1]
    *(ptr.add(1) as *mut u64) = self.amount;
    // Set decimals as u8 at offset [9]
    *ptr.add(9) = self.decimals;
}

let instruction = Instruction {
    program_id: &crate::ID,
    accounts: &account_metas,
    data: unsafe { &instruction_data.assume_init() },
};
  • For accounts (transfer_checked):
let account_metas: [AccountMeta; 4] = [
    AccountMeta::writable(self.from.key()),
    AccountMeta::readonly(self.mint.key()),
    AccountMeta::writable(self.to.key()),
    AccountMeta::readonly_signer(self.authority.key()),
];

To test it out and as an example on how to integrate, I created a repo that have all the CPI to this program and all the tests that actually look if the data is saved correctly at this link: https://github.com/L0STE/pinocchio-spl-examples

@febo
Copy link
Collaborator

febo commented Oct 31, 2024

This is great! I left a few comments, but overall looks very good. The main one is about UB when you get a pointer from a MaybeUninit. We can only do that if the value is initialized. I think it would be fine to just replace the use of MaybeUninit for static allocated arrays – the compiler will probably optimize the initialization.

Also, could you please run cargo fmt and clippy? Otherwise CI will keep complaining. 😀

@L0STE
Copy link
Contributor Author

L0STE commented Oct 31, 2024

Writing here a new comment addressing some of the changes:

Not sure if you were referring to something else but happy to change it if that's the case!

  • Pubkey deferencing
    Good catch on the Pubkey dereferencing, didn't think about that. Changed all the function + tests CPI to suits that changes and seeing if any UB got introduced but seems everything is good on that front

  • UB from pointers from a MaybeUninit
    Instead of getting rid of the MaybeUninit and using the zeroed out array, we would like to propose this alternative solution:

let mut instruction_data = MaybeUninit::<[u8; 10]>::uninit();

// Populate data
unsafe {
    let buf = instruction_data.assume_init_mut();
    
    // Set discriminator as u8 at offset [0]
    buf[0] = 12;
    // Set amount as u64 (convert to bytes and copy into buffer)
    buf[1..9].copy_from_slice(&self.amount.to_le_bytes());
    // Set decimals as u8 at offset [9]
    buf[9] = self.decimals;
}

let instruction = Instruction {
    program_id: &crate::ID,
    accounts: &account_metas,
    data: unsafe { &instruction_data.assume_init() },
};

If this is satisfying I'll change the instructions and run additional tests

@febo
Copy link
Collaborator

febo commented Oct 31, 2024

Writing here a new comment addressing some of the changes:

Not sure if you were referring to something else but happy to change it if that's the case!

I think what is tricky of UB is that it might work, but "break" in a different setting (arch or compiler version). From the docs, when a raw pointer is dereferenced (using the * operator), it must be non-null and aligned. I flagged a couple of values where the pointer won't be aligned to the type, which might work now but could break in the future.

A workaround would be to define the u64 types as byte arrays, e.g.:

pub struct Mint {
    pub mint_authority_present: [u8; 4],

    pub mint_authority: Pubkey,

    pub supply: [u8; 8],

    pub decimals: u8,

    pub is_initialized: bool,

    pub freeze_authority_present: [u8; 4],

    pub freeze_authority: Pubkey,
}

impl Mint {
    pub fn supply(&self) -> u64 {
        u64::from_le_bytes(self.supply)
    }
}

You can then cast the raw pointer into this type to access its fields. The "penalty" from the from_le_bytes should be relatively small. The other alternative would be to use core::ptr::read_unaligned — not sure if this one would perform better than the from_le_bytes.

  • Pubkey deferencing
    Good catch on the Pubkey dereferencing, didn't think about that. Changed all the function + tests CPI to suits that changes and seeing if any UB got introduced but seems everything is good on that front

Great! 👍

  • UB from pointers from a MaybeUninit
    Instead of getting rid of the MaybeUninit and using the zeroed out array, we would like to propose this alternative solution:
let mut instruction_data = MaybeUninit::<[u8; 10]>::uninit();

// Populate data
unsafe {
    let buf = instruction_data.assume_init_mut();
    
    // Set discriminator as u8 at offset [0]
    buf[0] = 12;
    // Set amount as u64 (convert to bytes and copy into buffer)
    buf[1..9].copy_from_slice(&self.amount.to_le_bytes());
    // Set decimals as u8 at offset [9]
    buf[9] = self.decimals;
}

let instruction = Instruction {
    program_id: &crate::ID,
    accounts: &account_metas,
    data: unsafe { &instruction_data.assume_init() },
};

If this is satisfying I'll change the instructions and run additional tests

I think this is still UB: https://doc.rust-lang.org/beta/std/mem/union.MaybeUninit.html#method.assume_init_mut

  • From the docs: Calling this [assume_init_mut] when the content is not yet fully initialized causes undefined behavior: it is up to the caller to guarantee that the MaybeUninit really is in an initialized state. For instance, .assume_init_mut() cannot be used to initialize a MaybeUninit.

I think the compiler will do a good job in optimizing this even with a "normal" zeroed static array since the array gets populated immediately after.

@L0STE
Copy link
Contributor Author

L0STE commented Nov 2, 2024

Update on the final PR:

- General nit fixes:
Deleted the tests folder, forgot to delete that when I moved the tests to the spl-pinocchio-example repo + ran cargo fmt and clip so now should be good to go on that point.

- Alignment Issues
Decided to opt for unsafe { core::ptr::read_unaligned(self.0.add(64) as *const u64) } as a solution since I tested it and didn't add any CUs.

- try_borrow_data() for Mint and Token
We opted to change the from_account_info_unchecked function to use try_borrow_data() as suggested but we added an unsafe function from_account_info_unchecked_unsafe that doesn't track the borrow cause using try_borrow_data() added 6CUs so it might be a good idea to just give the user the possibility of choosing.

/// # Safety
/// Use this when you don't care to keep track of the borrows &
/// the AccountInfo will have the correct owner and account space
#[inline(always)]
pub unsafe fn from_account_info_unchecked_unsafe(account_info: &AccountInfo) -> Self {
    Self(account_info.borrow_data_unchecked().as_ptr())
}

/// # Safety
/// Use this when you don't care to keep track of the borrows
pub unsafe fn from_account_info_unsafe(account_info: &AccountInfo) -> Self {
    assert_eq!(account_info.data_len(), Self::LEN);
    assert_eq!(account_info.owner(), &ID);
    Self::from_account_info_unchecked_unsafe(account_info)
}

/// Use this when you know the AccountInfo will have the correct owner and account space
#[inline(always)]
pub fn from_account_info_unchecked(account_info: &AccountInfo) -> Result<Self, ProgramError> {
    let data = account_info.try_borrow_data()?;
    Ok(Self(data.as_ptr()))
}

pub fn from_account_info(account_info: &AccountInfo) -> Result<Self, ProgramError> {
    if account_info.data_len() != Self::LEN {
        return Err(ProgramError::InvalidAccountData);
    }
    if account_info.owner() != &ID {
        return Err(ProgramError::InvalidAccountData);
    }
    Self::from_account_info_unchecked(account_info)
}
  • write_bytes() helper
    I added the write_bytes() helper in the lib.rs so we could reference it for every instruction. Might make sense to add it in the pinocchio general sdk but didn't want to make the choice myself. So leaving this discussion here for you guys to decide on

@febo
Copy link
Collaborator

febo commented Nov 2, 2024

Amazing! 🙌

- Alignment Issues Decided to opt for unsafe { core::ptr::read_unaligned(self.0.add(64) as *const u64) } as a solution since I tested it and didn't add any CUs.

Great!

- try_borrow_data() for Mint and Token We opted to change the from_account_info_unchecked function to use try_borrow_data() as suggested but we added an unsafe function from_account_info_unchecked_unsafe that doesn't track the borrow cause using try_borrow_data() added 6CUs so it might be a good idea to just give the user the possibility of choosing.

I think we might have an issue here with the fact that you are taking a raw pointer and the borrow going out of scope. I will have a closer look at the code to confirm this.

  • write_bytes() helper
    I added the write_bytes() helper in the lib.rs so we could reference it for every instruction. Might make sense to add it in the pinocchio general sdk but didn't want to make the choice myself. So leaving this discussion here for you guys to decide on

Read my mind. 😀 I think we can move it to an "utility" crate (e.g., pinocchio-invoke). There was also a suggestion of adding an Invoke trait to instruction structs (I guess it would facilitate passing them as arguments or create CPI "bundles"), so this trait could also be part of this new crate.

I will review the PR shortly - thanks for addressing the comments!

#[inline(always)]
pub fn from_account_info_unchecked(account_info: &AccountInfo) -> Result<Self, ProgramError> {
let data = account_info.try_borrow_data()?;
Ok(Self(data.as_ptr()))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm, this does not "keep" the borrow. The Ref returned by try_borrow_data() will drop once the method returns, but the type will keep the raw pointer. So it will look like the data is not borrowed while there is still a "live" raw pointer. Perhaps the solution is to hold the Ref<[u8]> instead of *const u8.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, maybe all we need is to map the Ref<[u8]> to a Ref<Mint>:

Ref::map(account_info.try_borrow_data()?, |data| Self(data.as_ptr()))

The return type of the method then becomes Result<Ref<Mint>, ProgramError>.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, maybe all we need is to map the Ref<[u8]> to a Ref<Mint>:

Ref::map(account_info.try_borrow_data()?, |data| Self(data.as_ptr()))

The return type of the method then becomes Result<Ref<Mint>, ProgramError>.

I think it's better too, more convenient for the user

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if this is still correct, but I needed to add an unsafe cast because the compiler was yelling at me so I went with this for the code solution at the end:

pub fn from_account_info(account_info: &AccountInfo) -> Result<Ref<Mint>, ProgramError> {
    if account_info.data_len() != Self::LEN { return Err(ProgramError::InvalidAccountData) }
    if account_info.owner() != &ID { return Err(ProgramError::InvalidAccountData) }
    Ok(Ref::map(account_info.try_borrow_data()?, |data| {
        unsafe { &*(data.as_ptr() as *const Mint) }
    }))
}

#[inline(always)]
pub fn from_account_info_unchecked(account_info: &AccountInfo) -> Result<Self, ProgramError> {
let data = account_info.try_borrow_data()?;
Ok(Self(data.as_ptr()))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here about the drop of the Ref<[u8]>.

Copy link
Collaborator

@febo febo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is super close, just a couple of things.

@febo
Copy link
Collaborator

febo commented Nov 4, 2024

@L0STE I had a second look at the *_unchecked and *_unsafe helpers – in particular to see if we can simplify the *_unchecked_unsafe naming.

Wonder if the following makes sense:

  • from_account_info(account_info: &AccountInfo): does the owner/length validation + returns a Ref<T>

  • from_account_info_unchecked(account_info: &AccountInfo): does the owner/length validation + returns a T (unchecked borrow)

  • from_bytes(bytes: &[u8]): returns T (the caller controls whether to pass an unchecked borrow or not).

@L0STE
Copy link
Contributor Author

L0STE commented Nov 5, 2024

@L0STE I had a second look at the *_unchecked and *_unsafe helpers – in particular to see if we can simplify the *_unchecked_unsafe naming.

Wonder if the following makes sense:

  • from_account_info(account_info: &AccountInfo): does the owner/length validation + returns a Ref<T>
  • from_account_info_unchecked(account_info: &AccountInfo): does the owner/length validation + returns a T (unchecked borrow)
  • from_bytes(bytes: &[u8]): returns T (the caller controls whether to pass an unchecked borrow or not).

I think this totally make sense, this is my interpretation of this, let me know If I'm close:

/// Performs owner and length validation on `AccountInfo` and returns a `Ref<T>` for safe borrowing.
pub fn from_account_info(account_info: &AccountInfo) -> Result<Ref<Mint>, ProgramError> {
    if account_info.data_len() != Self::LEN { return Err(ProgramError::InvalidAccountData) }
    if account_info.owner() != &ID { return Err(ProgramError::InvalidAccountData) }
    Ok(Ref::map(account_info.try_borrow_data()?, |data| {
        unsafe { &*(data.as_ptr() as *const Mint) }
    }))
}

/// # Safety
/// Performs owner and length validation on `AccountInfo` but performs unchecked borrowing and 
/// returns a `T` directly.
#[inline(always)]
pub unsafe fn from_account_info_unchecked(account_info: &AccountInfo) -> Result<Mint, ProgramError> {
    if account_info.data_len() != Self::LEN { return Err(ProgramError::InvalidAccountData) }
    if account_info.owner() != &ID { return Err(ProgramError::InvalidAccountData) }
    Ok(Self::from_bytes(account_info.borrow_data_unchecked().as_ref()))
}

/// # Safety
/// Constructs a `T` directly from a byte slice. The caller must ensure that `bytes` contains a 
/// valid representation of `T`.
pub unsafe fn from_bytes(bytes: &[u8]) -> Self {
    core::ptr::read(bytes.as_ptr() as *const Mint)
}

Copy link
Collaborator

@febo febo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great!

@febo febo merged commit d646338 into anza-xyz:main Nov 5, 2024
3 checks passed
let instruction = Instruction {
program_id: &crate::ID,
accounts: &account_metas,
data: unsafe { from_raw_parts(instruction_data.as_ptr() as _, 9) },

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not just slice here?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is an array of [MaybeUninit<u8>] so we need to cast it to [u8].


#[inline(always)]
fn write_bytes(destination: &mut [MaybeUninit<u8>], source: &[u8]) {
for (d, s) in destination.iter_mut().zip(source.iter()) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

did you check that this actually desugars to memcpy and not to a bunch of assignments?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we probably can replace this by a memcpy. It is using an iteration to call write since each position is a MaybeUninit<u8> but since we know the size in advance, we might be able to use ptr::copy_nonoverlapping instead.

@febo febo mentioned this pull request Nov 13, 2024
@L0STE L0STE deleted the spl-token branch November 17, 2024 15:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants