Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Re-use same outs & chosen mix outs across tx construction attempts #36

Merged
merged 6 commits into from
Jul 18, 2022

Conversation

j-berman
Copy link
Contributor

@j-berman j-berman commented Feb 28, 2022

Requesting review from @paulshapiro on this general architecture. It kinda throws somewhat of a wrench into the initial step1 -> step2 design.*

Context

When a client attempts to construct a transaction, it constructs a mock tx first using an estimate for the fee, then retries using a a higher fee that is the result of the more accurate calculate_fee(initial tx). Sometimes the client needs to make multiple retry attempts.

Also, when a transaction is constructed, the tx references prior outputs in the chain. These references are stored on transactions as "key offsets" (e.g. [ 301212232, 3827734, 312343, 15332, 193885, 14006, 599, 183, 5003, 101, 12003]), which are described well here:

Key offsets are the set of outputs your ring is using as "fake"outputs, as well as yours. Outputs of a given denomination are ordered in blockchain order, and thus can be represented by their index in that list. This is smaller than using the public key. Moreover, they're stored as offsets from the previous one (the first one from 0), as this will result in smaller values, which can often result in a yet smaller amount of data, since those numbers are written out in a variable length output (kinda like UTF-8 in rough outline).

Each tx construction retry attempt, the client fetches new decoys from the server, resulting in distinct key offsets for each attempt.

The problem

The key offsets of a transaction can vary in byte size across transaction attempts. This means the fee used in a transaction is unlikely to be the result of calculate_fee on the final transaction (which takes as input the tx's byte size when calculating the fee), but rather, the result of calculate_fee of the penultimate transaction construction attempt. This behavior is different from wallet2, which re-uses the same set of outs->mix_outs across transaction construction attempts, resulting in wallet2 constructing transactions that use fees equivalent to calculate_fee of the transaction stored on chain in practice. Thus, this difference from wallet2 should be fingerprintable on chain.

The solution

Re-use the same set of outputs and chosen mix outs for each output across transaction construction attempts. This avoids the above problem, matches wallet2, and also usually removes the need to make a trip to the server to fetch more decoys on subsequent tx construction attempts after the first.

[DONE 7/11/22] Lingering to-do's

I only updated 1 code path that the desktop app seems to use here: mymonero/mymonero-libapp-cpp#4

Other code paths are not using this update. Leaving that to you guys because I'm pretty confused how the other stuff should be changed/tested :)

*: I initially prefixed this new function with step1_and_a_half because I thought it would be funny, but decided that would be unprofessional :)

@fluffypony
Copy link

@j-berman FYI Paul was fired from MyMonero nearly 2 years ago.

@j-berman
Copy link
Contributor Author

@fluffypony ok but it was a valuable comment to improving my code and making it safer/easier to use, and it's open source, so figure it doesn't really matter? He wrote the structure initially, so I figured he'd have good thoughts on this since it has a material impact on the structure. And he did. I think I remember all the things he said specifically as it relates to how my code can be improved, so going to repeat for posterity:

  • rename tie_outs_to_mix_outs to something that more clearly shows how the outs are from a prior construction attempt, more in line with their function
  • rename passedIn_attempt_at_fee to something more explicit that this fee was calculated in the prior construction step
  • add a test vector for this new function tie_outs_to_mix_outs
  • highlight the code paths that warrant further investigation to ensure all consumers of these functions get this update to prevent them from fingerprinting

And in general, make an effort toward converging on code used in wallet2, potentially not needing to rewrite duplicate code that also needs to be updated and maintained, to try and avoid issues like this. Which I think everyone seems to prefer if it's possible

@fluffypony
Copy link

@j-berman sure - but this is largely kept around for historical significance; you may be better served using the Monero CPP library: https://github.com/monero-ecosystem/monero-cpp

@j-berman
Copy link
Contributor Author

j-berman commented Mar 1, 2022

@fluffypony confirmed this repo is almost certainly the code that MyMonero is using in production. Even if not using this exact repo, this PR is almost certainly a fix for whatever code MyMonero is using today.

I used the latest release of MyMonero's desktop app to make this transaction: d1ec19b652af6679082a1a2a27aeec4c9230ca33262759f1c120e8a1c8ac26f1

Then I modified blockchain_usage.cpp to analyze this transaction here: https://paste.debian.net/1232603/

Running that outputs:

The transaction's fee stored on chain is: 12280000

The estimated fee that MyMonero would calculate for this transaction is: 12280000

MyMonero must have used the estimated fee, therefore this is very likely the code that is being used by MyMonero in production (whatever code is being used has this bug in it regardless).

This fee calculation is distinct from wallet2 and therefore fingerprintable to code that has the same issues this PR and #35 are fixing.

@fluffypony
Copy link

@j-berman ah yeah - to be clear, I didn't meant to imply it isn't in limited use, just that (generally speaking) efforts to align things with wallet2 might be wasted :)

@j-berman
Copy link
Contributor Author

j-berman commented Mar 7, 2022

Updated this and mymonero/mymonero-libapp-cpp#4 as per @paulshapiro 's comments.

Of note, I've left the following 2 areas untouched (these 2 areas do not utilize the fix of this PR), and seem like may impact consumers of this code:

(1) serial_bridge_index

(2) async__send_funds

  • this one I'm less certain on. This should probably either be updated or removed as well.

@fluffypony I think there is some confusion here so just going to clarify. In speaking with @devinpearson , AFAIU there aren't any immediate plans to transition away from using this repo's code. There are ideas floating around, and it would be really nice to not have to try to match wallet2 (and instead just re-use its internal components without modification), but AFAIU that isn't in the immediate plans. It's an involved task that is fairly challenging. So this PR is fairly important to get right for the time being to stop fingerprinting users to this repo's code.

I agree that a general push toward not needing to write/maintain code that must maintain parity with wallet2 would be ideal, long-term.


Speaking for myself, this code benefited from Paul's comment. I'm grateful that I realized I still had it in my e-mail to be able to re-read and apply his suggestions. Here is everything from his comment that I applied in this PR:

Yes, this would be good, but I cannot yet approve of adding the impl for tie_outs_to_mix_outs as it has no test callers within this repo, core-cpp, since no entrypoints to the transaction creation exist in this repo anymore after I left. What would you say to adding a test case for tie_outs_to_mix_outs within this repo?

Secondly I would add that we could improve the name tie_outs_to_mix_outs slightly by qualifying which "outs" we are "tieing", i.e. something like tie_prev_attempt_outs_to_mix_outs and then distinguishing between decoy and unspent outs further.. And i think the word tie is good, but I would suggest it could be made more concrete to explain its purpose and problem it solves, if that makes sense. Something like what you are explaining in the function impl comment. That way it makes it clear to coders coming from arbitrary contexts what the function does and should be used for.

On that topic, to improve clarity around which fee is used, and to have avoided this problem, I would suggest we could slightly improve the name passedIn_attemptAt_fee, which is found in multiple places, by calling it something like... optl__try_this_last_size_calcd_fee ... which is admittedly slightly long but not that bad. We could even suffix it with _else_estimate to explain what the algo will do if it's boost::none.

I initially prefixed this new function with step1_and_a_half because I thought it would be funny, but decided that would be unprofessional :)

LOL! it's not actually the worst idea in the world though. Perhaps call it pre_step2_.... And maybe we could use the ½ unicode like this https://github.com/oxen-io/oxen-core/blob/ecb62e8faba07a7ec003435890d8053ef6088aad/src/wallet3/wallet2%C2%BD.cpp#L6

I would suggest we double-verify that there are absolutely no cases where you'd actually want to go and request fresh outs during a reconstruction.

Leaving that to you guys because I'm pretty confused how the other stuff should be changed/tested :)

For reference, which other codepaths should be noted? We wouldn't want to leave any means by which a user could end up consuming something fingerprintable.

I also redacted Paul's comments on MyMonero generally, and his thoughts on future plans to try and migrate light wallet code to converge on wallet2/core code. I feel uncomfortable doing this, I felt it was useful information for people. I didn't want to get involved in this beef. Unfortunately, it seems I can't help myself. Regardless of historical context, comments on code should be left to stand on their own merits, especially on open source code. It would have been unfortunate to have lost his comment, not just for the sake of improving this PR, but also more existentially, I prefer to live in a world where opinions and thoughts are shared freely and live or die on their own merit. My 2 picos.

@j-berman
Copy link
Contributor Author

Updated serial_bridge_index and async__send_funds

j-berman added 6 commits July 15, 2022 16:34
…ction attempts

Upon completing round 1 of tx construction, the client determines
if the fee used in the tx is > result of calculate_fee(). If not,
it re-attempts the tx using the result of calculate_fee(). If the
client does not re-use the same set of mix_outs from the prior
attempt for each out, then the size of the transaction likely
ends up being a little different (because transactions reference
mix outs by on optimized offset, which varies in size depending
on the set of mix outs), even though the attempt is targeting
a fee for a transaction of an expected size. This can result in
the transaction including a fee that is *not* the result of
calculate_fee(final_tx). Thus, this change provides a way for
the client to re-use mix outs across attempts, so that the final
fee is equivalent ot the result of calculate_fee. This is now
implemented the same way as wallet2 (which also re-uses the
same set of decoys across attempts).

Bonus: it avoids trips to the server to get new decoys every
tx attempt as well.
@j-berman j-berman force-pushed the tie-outs-to-mix-outs branch from fab3d09 to 71e7983 Compare July 15, 2022 23:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants