cleanup and update Mining Protocol specs #98
Conversation
It would also be useful to add the .dot files you used to generate these graphics, so they can be more easily modified in the future.
Great work @plebhash, I really like it! 👏
I left some comments and questions, which can be topics for further discussion.
The size of search space for an extended channel is `2^(NONCE_BITS+VERSION_ROLLING_BITS+extranonce_size*8)` per `nTime` value.
The size of the search space for one Standard Job, given a fixed `nTime` field, is `2^(NONCE_BITS + BIP320_VERSION_ROLLING_BITS) = ~280Th`, where `NONCE_BITS = 32` and `BIP320_VERSION_ROLLING_BITS = 16`.
I like this phrasing more:
"The size of the search space for one Standard Job is `2^(NONCE_BITS + BIP320_VERSION_ROLLING_BITS) = ~280Th`, per `nTime` value, where `NONCE_BITS = 32` and `BIP320_VERSION_ROLLING_BITS = 16`."
According to this, does it mean we should think about the fact that newer machines are already surpassing this 280Th/s limit? (look at the S21 Hydro or S21 XP Hydro)
HOM is not intended for powerful machines.
hmm so this phrase is wrong?
All SV2 mining devices are restricted to Standard Jobs. This is a big difference from legacy SV1 mining devices, where rolling extranonces is a common feature.
I wrote it based on my current understanding, and after reaching out to Braiins Firmware Support (Marin) and getting confirmation that SV2 Braiins Firmware is restricted to HOM.
Can some powerful SV2 Mining Device open Extended Channels and receive `NewExtendedMiningJob`?
yes
this was a bit confusing to me, because it made me think: "what is the purpose of HOM Standard Jobs after all?"
and based on this question #98 (comment), I believe @GitGab19 also shares this feeling
I thought about @Fi3's answer and here's my current perspective on the trade-offs of Standard Jobs:
rolling Extranonces is an "advanced feature" that makes firmware more complex and harder to maintain, so from a business perspective, it should be avoided if possible.
so a way to correct the phrase above would be:
All SV2 mining devices with hashrate under 280 TH/s should be restricted to HOM via Standard Jobs. This is a big difference from legacy SV1 mining devices, where rolling extranonces is a common feature. Hashing space optimization should be achieved upstream, where efficient telemetry is achieved by keeping track of multiple Standard Jobs via Group Channels.
If the SV2 mining device hashrate is above 280 TH/s, then it should support non-HOM via Extended Jobs.
hopefully the spec authors can confirm this in the near future (when we ask them to review this PR)
I think HOM can be used for almost any machine, provided it doesn't exceed that hashrate by orders of magnitude.
The device is supposed to receive new mining jobs periodically. Every new job resets the nTime value again to the `min_ntime` for a given job. So unless the time rolls hours into the future within the usual 30-second window, I don't think it's a problem.
The problem of rolling nTime too far ahead of the true time is equivalent to the problem of "not receiving a new mining job frequently enough".
It's difficult to draw a hard line. It's probably better to be a bit vague about that, and say something like "be sane about nTime rolling or otherwise your work will be rejected".
quick calculation:
Assume we allow nTime to roll 10 minutes into the future for a given mining job that is being received every 30 seconds. In order to get to that point we would need 5.6 PH/s (= 280 * 600 / 30). I don't think we are going to see such devices in the foreseeable future (a simple device controlled by a single controller - more complex devices split across multiple machines need to open an extended channel).
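That back-of-the-envelope number can be written out as a sketch (the 10-minute roll allowance and the 30-second job interval are the assumptions from the comment above):

```rust
fn main() {
    let per_ntime = (1u64 << 48) as f64; // ~280 Th of search space per nTime value
    let ntime_roll_secs = 600.0;         // assumption: 10 minutes of nTime rolling allowed
    let job_interval_secs = 30.0;        // assumption: a new job arrives every 30 seconds

    // Hashrate needed to burn through all allowed nTime values within one job window.
    let hashrate = per_ntime * ntime_roll_secs / job_interval_secs;
    println!("{:.1} PH/s", hashrate / 1e15); // prints "5.6 PH/s"
}
```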
thanks for the clarification!
I'll tag @rrybarczyk here since this topic was part of a discussion we had not long ago, while we went through this PR.
When we had this discussion, my understanding was that HOM over Standard Jobs was designed to cater for Mining Devices with hashrate below 280 TH/s. This clarifies that this is not the case at all. HOM is perfectly suitable for almost any kind of Mining Device, as long as jobs are updated frequently.
But now that leaves me wondering: is there any scenario where Mining Devices would open Extended Channels with an upstream?
05-Mining-Protocol.md
A Mining Proxy can forward multiple Standard Channels to the Upstream, ideally one representing each SV2 Mining Device Downstream.
The Upstream (Pool, JDC or another Mining Proxy) is expected to send a `SetGroupChannel` message aggregating different Standard Channels.
Is the `SetGroupChannel` triggered by some specific behaviour? Or can the trigger to send it be arbitrarily defined by pools?
the main criterion I can think of is: all Standard Channels under the same Connection are unified under the same Group Channel
but maybe the spec authors can elaborate on some other scenarios
05-Mining-Protocol.md
The Upstream (Pool, JDC or another Mining Proxy) is expected to send a `SetGroupChannel` message aggregating different Standard Channels.
The Upstream sends jobs via `NewExtendedMiningJob` to Group Channels, and if some Downstream is a Mining Device (i.e.: the `SetupConnection` had the `REQUIRES_STANDARD_JOBS` flag) the Mining Proxy converts that message into different `NewMiningJob` messages after calculating the correct Merkle Root based on each Standard Channel's `extranonce_prefix`.

Alternatively, a Mining Proxy could aggregate multiple Downstream Standard Channels into a single Extended Channel. The Upstream sends Jobs via `NewExtendedMiningJob`, and for each Downstream SV2 Mining Device, the Mining Proxy sends a different `NewMiningJob` message, where the Merkle Root is based on the Standard Channel's `extranonce_prefix`.
What could be the reason why a proxy prefers to open many standard channels (with upstream), which are then grouped into a group channel? Why is that proxy not directly opening an extended channel with upstream?
I asked a similar question when I was looking at some implementation details
stratum-mining/stratum#1145 (reply in thread)
@Fi3's answer was:
it might want to group standard channels from downstream, maybe a pool wants to go that way to have telemetry or for whatever other reasons. But you should ask the spec's authors.
Hopefully we will get some reviews by the spec authors here soon, so they could also help further clarify this in more detail.
The graphics were generated via …

I actually think this is a good idea.

That's a good start.
review by @mcelrath: plebhash/sv2-spec@update-mining-protocol...mcelrath:sv2-spec:patch-1 (dropping this link here so I can dive into it later)
### 5.1.4 Future Jobs
- The `extranonce_prefix` bytes are reserved for the upstream layer, where fixed bytes were already established for the Extranonce and are no longer available for rolling or search space splitting.
Could we get an explanation of why they are needed, what purpose they serve, and how they are divided?
Given that the extended extranonce field is an array of 32 bytes, let's assume we split it into x, y, and z. The downstream nodes use standard channels, and with one group channel, we can accommodate up to 2^32 standard channels. However, since x is non-zero and y + z < 32, we won't be able to support the maximum number of downstream channels.
Could we get an explanation of why they are needed, what purpose they serve, and how they are divided?
do you mean Extended Extranonces as a whole, or `extranonce_prefix` specifically?
Given that the extended extranonce field is an array of 32 bytes, let's assume we split it into x, y, and z. The downstream nodes use standard channels, and with one group channel, we can accommodate up to 2^32 standard channels. However, since x is non-zero and y + z < 32, we won't be able to support the maximum number of downstream channels.
the wording on 2^32 channels per connection says:
There can theoretically be up to 2^32 open Channels within one Connection.
"up to" means this is a theoretical ceiling, but it does not imply this capacity will be filled every time... this should be interpreted as "there will never be more than 2^32 Channels per Connection" rather than "there can always be 2^32 Channels per Connection"
the only layer where the 2^32 capacity could ever be used is the most upstream layer (Pool or JDC), and by definition there are no fixed `extranonce_prefix` bits on this layer
but thanks for the feedback, I updated this phrase to make it more clear:
There can theoretically be up to `2^32` open Channels within one Connection. This is, however, just a theoretical ceiling, and it does not mean that every Connection will be able to fill this full capacity (maybe the search space has already been narrowed).
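As a sketch of the capacity argument (the `max_channels` helper and the byte counts are illustrative only, following the x/y/z split discussed above):

```rust
/// Upper bound on the Standard Channels one layer can allocate, given how many
/// Extended Extranonce bytes it has locally reserved for that purpose.
/// Channel ids are u32, so 2^32 stays a hard ceiling regardless.
fn max_channels(locally_reserved_bytes: u32) -> u128 {
    (1u128 << (8 * locally_reserved_bytes)).min(1u128 << 32)
}

fn main() {
    assert_eq!(max_channels(2), 1 << 16); // 2 bytes reserved -> at most 65_536 channels
    assert_eq!(max_channels(4), 1 << 32); // 4 bytes already saturate the u32 ceiling
}
```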
05-Mining-Protocol.md
An empty future block job or speculated non-empty job can be sent in advance to speed up new mining job distribution.
The point is that the mining server MAY have precomputed such a job and is able to pre-distribute it to all active channels.
The only missing information to start mining on the new block is the new prevhash.
![](./img/extended_extranonce_proxies.png)
This image might be misleading for the pure SV2 side, as it ends the flow with the `merkle_root`. Can we have something like "no rolling merkle root"?
When transitioning from one proxy to another, are we concatenating the extranonce_prefix and the local reserved portion, or are we hashing them?
This image might be misleading for the pure SV2 side, as it ends the flow with the `merkle_root`. Can we have something like "no rolling merkle root"?

I agree that was misleading. I changed the Mining Device layer to `extranonce_prefix`, because that is what is actually assigned to each Standard Channel.
Only later, when Standard Jobs are created, is the unique Merkle Root calculated (`coinbase_prefix` + `extranonce_prefix` + `coinbase_suffix`).
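A minimal sketch of that derivation, assuming the usual Bitcoin double-SHA256 merkle pairing and using the `sha2` crate (the function name and signature are illustrative, not the actual SRI API):

```rust
use sha2::{Digest, Sha256};

fn sha256d(data: &[u8]) -> [u8; 32] {
    Sha256::digest(Sha256::digest(data)).into()
}

/// Sketch: the Merkle Root for one Standard Channel, derived from the Extended
/// Job's coinbase halves plus the channel's fixed extranonce_prefix.
fn standard_job_merkle_root(
    coinbase_prefix: &[u8],
    extranonce_prefix: &[u8],
    coinbase_suffix: &[u8],
    merkle_path: &[[u8; 32]],
) -> [u8; 32] {
    // The full coinbase tx is a plain concatenation of the three parts.
    let coinbase = [coinbase_prefix, extranonce_prefix, coinbase_suffix].concat();

    // Fold the coinbase txid up the tree using the provided merkle path
    // (the coinbase is the leftmost leaf, so it is always the left operand).
    let mut root = sha256d(&coinbase);
    for sibling in merkle_path {
        let mut pair = [0u8; 64];
        pair[..32].copy_from_slice(&root);
        pair[32..].copy_from_slice(sibling);
        root = sha256d(&pair);
    }
    root
}
```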
When transitioning from one proxy to another, are we concatenating the extranonce_prefix and the local reserved portion, or are we hashing them?
we concatenate.
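A minimal sketch of that concatenation, assuming each proxy layer simply appends its locally reserved bytes to the prefix it received from upstream (all names are illustrative):

```rust
/// Sketch: each proxy layer extends the prefix it received from upstream by
/// appending its own locally reserved bytes (here a simple 4-byte counter).
/// Illustrative only; a real proxy must stay within the negotiated extranonce_size.
struct PrefixAllocator {
    upstream_prefix: Vec<u8>, // extranonce_prefix this proxy got from upstream
    next: u32,                // counter over this proxy's locally reserved bytes
}

impl PrefixAllocator {
    fn allocate(&mut self) -> Vec<u8> {
        let mut prefix = self.upstream_prefix.clone();
        prefix.extend_from_slice(&self.next.to_be_bytes()); // concatenation, no hashing
        self.next += 1;
        prefix
    }
}
```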
- `extranonce_size`: how many bytes are available for the locally reserved and downstream reserved areas of the Extended Extranonce.

And a [Standard Channel](#531-standard-channels) has the following property:
- `extranonce_prefix`: the Extended Extranonce bytes that were already allocated by the upstream server.
Why does a standard channel even need extranonce information if it's only used for header-only mining?
you have fewer messages to send from the pool to the proxy
multiple standard channels are aggregated into a group channel
when an `ExtendedMiningJob` message comes into the group channel, it needs to be converted into multiple `NewMiningJob` messages
each `NewMiningJob` message needs to have a unique Merkle Root, otherwise there would be collisions in the search space
the uniqueness of each Merkle Root is determined by the static `extranonce_prefix` of each standard channel (by combining it with `coinbase_prefix` and `coinbase_suffix` from the Extended Job that came from upstream)
overall, the trick to assimilating this is to think in terms of proxy layers: usually, there are multiple layers of extended jobs coming from upstream, and only on the last layer are they converted into standard jobs, where Merkle Roots have to be unique
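Putting that together, a sketch of the fan-out at the last proxy layer (types and fields are illustrative, not the SRI definitions; `standard_job_merkle_root` is the helper sketched earlier in this thread):

```rust
/// Sketch: fan an Extended Job out into one NewMiningJob per Standard Channel
/// in a group.
struct ExtendedJob {
    coinbase_prefix: Vec<u8>,
    coinbase_suffix: Vec<u8>,
    merkle_path: Vec<[u8; 32]>,
}
struct StandardChannel {
    channel_id: u32,
    extranonce_prefix: Vec<u8>, // static and unique per channel
}
struct NewMiningJob {
    channel_id: u32,
    merkle_root: [u8; 32],
}

fn fan_out(job: &ExtendedJob, group: &[StandardChannel]) -> Vec<NewMiningJob> {
    group
        .iter()
        .map(|ch| NewMiningJob {
            channel_id: ch.channel_id,
            // A unique extranonce_prefix yields a unique Merkle Root per channel,
            // so the per-channel search spaces cannot collide.
            merkle_root: standard_job_merkle_root(
                &job.coinbase_prefix,
                &ch.extranonce_prefix,
                &job.coinbase_suffix,
                &job.merkle_path,
            ),
        })
        .collect()
}
```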
05-Mining-Protocol.md
Each Channel identifies a dedicated mining session associated with an authorized user.
Upstream stratum nodes accept work submissions and specify a mining target on a per-channel basis.

There can theoretically be up to `2^32` open Channels within one Connection.
Could you please point me to the part of the codebase where we are implementing the multiplexing?
I'm not sure what you mean by multiplexing here.
but maybe the `channel_logic` module of `roles_logic_sv2` could provide you some insight (as well as its usage in the `roles` crates)
A proxy can either transparently allow its clients to open separate Channels with the server (preferred behavior), or aggregate open connections from downstream devices into its own open channel with the server and translate the messages accordingly (present mainly for allowing v1 proxies).
Both options have some practical use cases.
In either case, proxies SHOULD aggregate clients' Channels into a smaller number of Connections.
None of the SV1 miners are going to have a standard channel, since they require a non-homogeneous setup. How does aggregation work in that case?

At the same time, I have this question: each of these miners (clients) connects to the translator proxy, and each connection is a TCP connection. So how are we achieving multiplexing? My understanding was that with a single TCP connection, we could manage around 2^32 channels, but these channels should point to the same endpoint and starting point. Each of these miners connects to the same endpoint (the translator proxy), but they each have different ports. Each server can only handle a limited number of TCP connections, which is nowhere near 2^32 (more like on the order of 10^5). So the aggregation doesn't happen at the level of the translator proxy and the miner, but rather between the translator proxy and another upstream connection, since they will have a single TCP connection through which we can have 2^32 channels for each downstream miner node. However, this raises a concern: the translator proxy becomes a single point of failure, so the provider needs to implement scaling strategies to address that.
None of the SV1 miners are going to have a standard channel, since they require a non-homogeneous setup. How does aggregation work in that case?

tProxy makes sure that each SV1 miner is assigned a unique `extranonce_prefix`.
My understanding was that with a single TCP connection, we could manage around 2^32 channels, but these channels should point to the same endpoint and starting point. Each of these miners connects to the same endpoint (the translator proxy), but they each have different ports. Each server can only handle a limited number of TCP connections, which is nowhere near 2^32 (more like on the order of 10^5). So the aggregation doesn't happen at the level of the translator proxy and the miner, but rather between the translator proxy and another upstream connection, since they will have a single TCP connection through which we can have 2^32 channels for each downstream miner node. However, this raises a concern: the translator proxy becomes a single point of failure, so the provider needs to implement scaling strategies to address that.
I see the practical limitation you are pointing out here. However, I think this is easily solvable by stacking multiple proxy layers.
This limitation only manifests if we try to connect more than 10^5 machines into one single proxy. But if they are distributed across multiple proxies, there's no issue.
And if we imagine there's one "funnel" proxy aggregating all of them, and then forwarding that into some upstream layer (maybe a pool or JDC), this upstream layer would still be able to have up to 2^32 channels in a single Connection.
Here are some comments about the practical aspects of this:
Aesthetic remark: As a reviewer, I'd appreciate it if the diffs were organized in a way that allows for simple inspection of what has been added, what was changed and what was removed. Possibly split it across multiple commits. This commit looks as if large chunks were replaced by a different text and reintroduced somewhere else, making it difficult to quickly validate.
thanks for your feedback @jakubtrnka, here are some updates:
I would appreciate a second (final?) round of reviews, if possible
In case the `*.png` files in the parent directory need to be adjusted, the `*.drawio.xml` files from this directory should be imported into the `draw.io` online tool for editing.

After editing:
- export the `.png` file to replace the old one in the parent directory
- export the `.drawio.xml` file to replace the old one in this directory
@Sjors there you go
note: before merging this PR I still need to update the indices with chapter headers on …
img/extended_job.png
this comment gave me some doubts about how accurate this image is
given that there are some subtle differences between `coinbase_tx_prefix` and `coinbase_prefix`, I wonder if the labels I put into `coinbase_tx_prefix` and `coinbase_tx_suffix` are correct
@TheBlueMatt can you please do a sanity check here?
For non-custom work that looks close to me. Might be worth noting that `coinbase_tx_prefix` also includes the coinbase tx version and input count bytes (so it's not just a `scriptSig`). The `coinbase_tx_suffix` is similarly not just the `scriptPubKey`; it includes the output count, multiple outputs (including the segwit commitment), value, etc.
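An illustrative (non-normative) sketch of what each half covers in a serialized coinbase transaction, per the note above:

```rust
/// Illustrative view of the two halves of a serialized coinbase tx;
/// the extranonce bytes sit between them, inside the scriptSig.
struct CoinbaseTxPrefix {
    version: [u8; 4],         // coinbase tx version
    input_count: u8,          // varint, always 1 for a coinbase
    prevout: [u8; 36],        // null txid + 0xffffffff index
    script_sig_len: u8,       // varint (1 byte for short scripts)
    script_sig_head: Vec<u8>, // e.g. BIP34 height push, up to the extranonce
}

struct CoinbaseTxSuffix {
    script_sig_tail: Vec<u8>, // remainder of scriptSig after the extranonce
    sequence: [u8; 4],        // input sequence
    output_count: u8,         // varint
    outputs: Vec<u8>,         // serialized outputs (value + scriptPubKey each,
                              // including the segwit commitment output)
    lock_time: [u8; 4],       // tx locktime
}
```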
Hey @jakubtrnka, it would be great to get your final review on this one. @Fi3 if you get a chance it would be great to get your pair of eyes as well. 🙏
close #96
cleanup and update Mining Protocol specs
The current state of the Mining Protocol Specs leaves some Key Concepts for the reader to infer.
Moreover, there are visual references with out-of-date information.
This PR aims to bring more clarity to the Mining Protocol Specs by:
- New Key Concept definitions

Considerations
This PR does not aim to re-write the Mining Protocol Specs. All semantics should remain fully aligned with the original ideas coming from the Spec authors.
This PR does not aim to introduce big changes under 5.4 Mining Protocol Messages. Even though I believe this section deserves a cleanup in terms of formatting standards, I want to limit the scope of this PR to improving conceptual clarity in the Mining Protocol Specs.
Some Key Concepts are referred to before they are formally introduced. This is a bit of a conceptual "chicken-and-egg" problem and I couldn't really find an optimal arrangement for the doc without making some compromises. I believe the original Spec authors faced similar challenges, and at the end of the day, we should expect the reader to go over multiple iterations of reading the doc before getting a full grasp of the ideas being presented.
Reviewing diff lines in Markdown syntax can be hard, especially when it comes to seeing the visual aids side-by-side with the text. I would highly recommend reviewers refer to this rendering on my fork.
Ideas for follow-up PRs