
cleanup and update Mining Protocol specs #98

Open · plebhash wants to merge 5 commits into base: main from update-mining-protocol

Conversation

@plebhash (Contributor) commented Sep 5, 2024

close #96

cleanup and update Mining Protocol specs

The current state of the Mining Protocol Specs leaves some Key Concepts for the reader to infer.

Moreover, there are visual references with out-of-date information.

This PR aims to bring more clarity to the Mining Protocol Specs by:

  • explicitly defining some Key Concepts
  • providing some up-to-date visual aids

New Key Concept definitions

  • Job (Standard, Extended, Future, Custom)
  • Extended Extranonce

Considerations

  • This PR does not aim to re-write the Mining Protocol Specs. All semantics should remain fully aligned with the original ideas coming from the Spec authors.

  • This PR does not aim to introduce big changes under 5.4 Mining Protocol Messages. Even though I believe this section deserves a cleanup in terms of formatting standards, I want to limit the scope of this PR to conceptual clarity on the Mining Protocol Specs.

  • Some Key Concepts are referred to before they are formally introduced. This is a bit of a conceptual "chicken-and-egg" problem, and I couldn't really find an optimal arrangement for the doc without making some compromises. I believe the original Spec authors faced similar challenges, and at the end of the day, we should expect the reader to go through the doc multiple times before getting a full grasp of the ideas being presented.

  • Reviewing diff lines in Markdown syntax can be hard, especially when it comes to seeing the visual aids side-by-side with the text. I would highly recommend that reviewers refer to this rendering on my fork.

Ideas for follow-up PRs

  • implement formatting standards across the entire doc (hopefully in congruence with all Markdown files under this repository).
  • add some flowcharts illustrating Message flows across different Roles under the Mining Protocol.

@plebhash force-pushed the update-mining-protocol branch 6 times, most recently from 11ae8cd to 6bb2c1c on September 5, 2024 at 22:21
@plebhash changed the title from "cleanup Mining Protocol specs" to "cleanup and update Mining Protocol specs" on Sep 5, 2024
@plebhash force-pushed the update-mining-protocol branch 3 times, most recently from e481ac2 to 3e986cd on September 5, 2024 at 23:50
@Sjors (Contributor) commented Sep 6, 2024

I find the translator diagram a bit too visually cluttered.

Since the information for the second and third device is duplicated, you could just leave it out:

[Screenshot, 2024-09-06: simplified translator diagram with the second and third device left out]

Plus a small visual hint that the pattern is repeated.

@Sjors (Contributor) commented Sep 6, 2024

It would also be useful to add the .dot files you used to generate these graphics, so they can be more easily modified in the future.

@GitGab19 (Collaborator) left a comment

Great work @plebhash, I really like it! 👏
I left some comments and questions, which can be topics for further discussion.


The size of search space for an extended channel is `2^(NONCE_BITS+VERSION_ROLLING_BITS+extranonce_size*8)` per `nTime` value.
The size of the search space for one Standard Job, given a fixed `nTime` field, is `2^(NONCE_BITS + BIP320_VERSION_ROLLING_BITS) = ~280Th`, where `NONCE_BITS = 32` and `BIP320_VERSION_ROLLING_BITS = 16`.
Collaborator:

I prefer the phrase:
"The size of the search space for one Standard Job is 2^(NONCE_BITS + BIP320_VERSION_ROLLING_BITS) = ~280Th, per nTime value, where NONCE_BITS = 32 and BIP320_VERSION_ROLLING_BITS = 16."

Collaborator:

According to this, does it mean we should think about the fact that newer machines are already surpassing this 280 TH/s limit? (see the S21 Hydro or S21 XP Hydro)

@Fi3 (Contributor), Sep 6, 2024:

HOM is not intended for powerful machines.

@plebhash (Contributor Author), Sep 6, 2024:

hmm so this phrase is wrong?

All SV2 mining devices are restricted to Standard Jobs. This is a big difference from legacy SV1 mining devices, where rolling extranonces is a common feature.

I wrote it based on my current understanding, and after reaching out to Braiins Firmware Support (Marin) and getting confirmation that SV2 Braiins Firmware is restricted to HOM.

Can some powerful SV2 Mining Device open Extended Channels and receive NewExtendedMiningJob?

Contributor:

yes

@plebhash (Contributor Author), Sep 7, 2024:

this was a bit confusing to me, because it made me think: "what is the purpose of HOM Standard Jobs after all?"

and based on this question #98 (comment), I believe @GitGab19 also shares this feeling

I thought about @Fi3's answer and here's my current perspective on the trade-offs of Standard Jobs:


rolling Extranonces is an "advanced feature" that makes firmware more complex and harder to maintain, so from a business perspective, it should be avoided if possible.

so a way to correct the phrase above would be:

All SV2 mining devices with hashrate under 280 TH/s should be restricted to HOM via Standard Jobs. This is a big difference from legacy SV1 mining devices, where rolling extranonces is a common feature. Hashing space optimization should be achieved upstream, where efficient telemetry is achieved by keeping track of multiple Standard Jobs via Group Channels.

If the SV2 mining device hashrate is above 280 TH/s, then it should support non-HOM via Extended Jobs.


hopefully the spec authors can confirm this in the near future (when we ask them to review this PR)

Collaborator:

I think HOM can be used for almost any machine, provided it doesn't exceed the hash rate by orders of magnitude.

The device is supposed to receive new mining jobs periodically. Every new job resets the nTime value to the min_ntime for that job. So unless the time rolls hours into the future within the usual 30-second window, I don't think it's a problem.

The problem with rolling nTime too far ahead of the true time is equivalent to the problem of "not receiving a new mining job frequently enough".
It's difficult to draw a hard line. It's probably better to be a bit vague about that and say something like "be sane about nTime rolling, or otherwise your work will be rejected".

quick calculation:
Assume we allow nTime to roll 10 minutes into the future for a given mining job that is being received every 30 seconds. To get to that point we would need 5.6 PH/s (= 280 Th × 600 / 30). I don't think we are going to see such devices in the foreseeable future (a simple device controlled by a single controller; more complex devices split across multiple machines need to open an extended channel).
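A minimal Rust sketch of that back-of-the-envelope calculation (illustrative only; the 10-minute roll window and 30-second job interval are the assumptions stated above, not spec requirements):

```rust
fn main() {
    let per_ntime_th = (1u64 << 48) as f64 / 1e12; // ~281 Th of search space per nTime value
    let ntime_roll_window_s = 600.0;               // nTime allowed to roll 10 minutes ahead
    let job_interval_s = 30.0;                     // a fresh job arrives every 30 seconds

    // Hashrate needed to exhaust every rollable nTime value before the next job arrives.
    let required_th_per_s = per_ntime_th * ntime_roll_window_s / job_interval_s;
    println!("~{:.1} PH/s", required_th_per_s / 1e3); // ~5.6 PH/s
}
```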

@plebhash (Contributor Author), Oct 26, 2024:

thanks for the clarification!

I'll tag @rrybarczyk here since this topic was part of a discussion we had not long ago, while we went through this PR.

When we had this discussion, my understanding was that HOM over Standard Jobs was designed to cater for Mining Devices with hashrate below 280 TH/s. This clarifies that this is not the case at all: HOM is perfectly suitable for almost any kind of Mining Device, as long as jobs are updated frequently.

But now that leaves me wondering: is there any scenario where Mining Devices would open Extended Channels with an upstream?

Comment on lines 546 to 566
A Mining Proxy can forward multiple Standard Channels to the Upstream, ideally one representing each SV2 Mining Device Downstream.
The Upstream (Pool, JDC or another Mining Proxy) is expected to send a `SetGroupChannel` message aggregating different Standard Channels.
Collaborator:

Is the SetGroupChannel triggered by some specific behaviour?
Or can the trigger to send it be arbitrarily defined by pools?

Contributor Author:

the main criterion I can think of is: all Standard Channels under the same Connection are unified under the same Group Channel

but maybe the spec authors can elaborate on some other scenarios

The Upstream (Pool, JDC or another Mining Proxy) is expected to send a `SetGroupChannel` message aggregating different Standard Channels.
The Upstream sends jobs via `NewExtendedMiningJob` to Group Channels, and if some Downstream is a Mining Device (i.e.: the `SetupConnection` had the `REQUIRES_STANDARD_JOBS` flag) the Mining Proxy converts that message into different `NewMiningJob` messages after calculating the correct Merkle Root based on each Standard Channel's `extranonce_prefix`.

Alternatively, a Mining Proxy could aggregate multiple Downstream Standard Channels into a single Extended Channel. The Upstream sends Jobs via `NewExtendedMiningJob`, and for each Downstream SV2 Mining Device, the Mining Proxy sends a different `NewMiningJob` message, where the Merkle Root is based on the Standard Channel's `extranonce_prefix`.
Collaborator:

What could be the reason why a proxy prefers to open many standard channels (with upstream), which are then grouped into a group channel? Why would that proxy not directly open an extended channel with the upstream?

@plebhash (Contributor Author), Sep 6, 2024:

I asked a similar question when I was looking at some implementation details

stratum-mining/stratum#1145 (reply in thread)

@Fi3 answer was:

it might want to group standard channels from downstream, maybe a pool want to go that way to have telemetry or for whatever other reasons. But you should ask the spec's authors.

Hopefully we will get some reviews by the spec authors here soon, so they could also help further clarify this in more detail.

@plebhash (Contributor Author) commented Sep 6, 2024

It would also be useful to add the .dot files you used to generate these graphics, so they can be more easily modified in the future.

The graphics were generated via draw.io. The closest I could do to your suggestion is to export and commit .drawio.xml files, which could later be imported back into the tool and further edited.

I actually think this is a good idea.

@Sjors (Contributor) commented Sep 6, 2024

commit drawio.xml

That's a good start.

@plebhash force-pushed the update-mining-protocol branch 7 times, most recently from 69d2474 to 2c058bf on September 10, 2024 at 20:01
@rrybarczyk self-requested a review on September 17, 2024 at 14:55
@plebhash (Contributor Author):

review by @mcelrath

plebhash/sv2-spec@update-mining-protocol...mcelrath:sv2-spec:patch-1

dropping this link here so I can dive into it later


### 5.1.4 Future Jobs
- The `extranonce_prefix` bytes are reserved for the upstream layer, where fixed bytes were already established for the Extranonce and are no longer available for rolling or search space splitting.
Contributor:

Could we get an explanation of why they are needed, what purpose they serve, and how they are divided?

Contributor:

Given that the extended extranonce field is an array of 32 bytes, let's assume we split it into x, y, and z. The downstream nodes use standard channels, and with one group channel, we can accommodate up to 2^32 standard channels. However, since x is non-zero and y + z < 32, we won't be able to support the maximum number of downstream channels.

Contributor Author:

Could we get an explanation of why they are needed, what purpose they serve, and how they are divided?

do you mean Extended Extranonces as a whole, or extranonce_prefix specifically?

@plebhash (Contributor Author), Sep 26, 2024:

Given that the extended extranonce field is an array of 32 bytes, let's assume we split it into x, y, and z. The downstream nodes use standard channels, and with one group channel, we can accommodate up to 2^32 standard channels. However, since x is non-zero and y + z < 32, we won't be able to support the maximum number of downstream channels.

the wording on 2^32 channels per connection says:

There can theoretically be up to 2^32 open Channels within one Connection.

up to means this is a theoretical ceiling, but it does not imply this capacity will be filled every time... this should be interpreted as "there will never be more than 2^32 Channels per Connection" rather than "there can always be 2^32 Channels per Connection"

the only layer where the 2^32 capacity could ever be used is the most upstream layer (Pool or JDC), and by definition there are no fixed extranonce_prefix bits on this layer

but thanks for the feedback, I updated this phrase to make it more clear:

There can theoretically be up to 2^32 open Channels within one Connection. This is however just a theoretical ceiling, and it does not mean that every Connection will be able to fill this full capacity (maybe the search space has already been narrowed).

An empty future block job or speculated non-empty job can be sent in advance to speed up new mining job distribution.
The point is that the mining server MAY have precomputed such a job and is able to pre-distribute it for all active channels.
The only missing information to start to mine on the new block is the new prevhash.
![](./img/extended_extranonce_proxies.png)
Contributor:

This image might be misleading for the pure SV2 side, as it ends the flow with the merkle_root. Can we have something like "no rolling merkle root"?

Contributor:

When transitioning from one proxy to another, are we concatenating the extranonce_prefix and the local reserved portion, or are we hashing them?

Contributor Author:

This image might be misleading for the pure SV2 side, as it ends the flow with the merkle_root. Can we have something like no rolling merkle root

I agree that was misleading. I changed the Mining Device layer to extranonce_prefix, because that is what is actually assigned to each Standard Channel.

Only later, when Standard Jobs are created, is the unique Merkle Root calculated (coinbase_prefix + extranonce_prefix + coinbase_suffix).
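A rough sketch of that derivation (illustrative only, not the SRI implementation; it assumes the `sha2` crate and the usual double-SHA256 Merkle-branch folding, and the helper name is hypothetical):

```rust
use sha2::{Digest, Sha256};

// Bitcoin-style double SHA-256.
fn dsha256(data: &[u8]) -> [u8; 32] {
    let once = Sha256::digest(data);
    let twice = Sha256::digest(&once);
    let mut out = [0u8; 32];
    out.copy_from_slice(&twice);
    out
}

// Unique Merkle Root for one Standard Channel: rebuild the coinbase tx as
// coinbase_prefix || extranonce_prefix || coinbase_suffix, hash it, then fold
// in the job's merkle_path.
fn standard_job_merkle_root(
    coinbase_prefix: &[u8],
    extranonce_prefix: &[u8],
    coinbase_suffix: &[u8],
    merkle_path: &[[u8; 32]],
) -> [u8; 32] {
    let mut coinbase = Vec::new();
    coinbase.extend_from_slice(coinbase_prefix);
    coinbase.extend_from_slice(extranonce_prefix);
    coinbase.extend_from_slice(coinbase_suffix);

    let mut root = dsha256(&coinbase); // coinbase txid, used as the leaf
    for node in merkle_path {
        let mut pair = Vec::with_capacity(64);
        pair.extend_from_slice(&root);
        pair.extend_from_slice(node);
        root = dsha256(&pair); // the coinbase leaf is always on the left
    }
    root
}
```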

Contributor Author:

When transitioning from one proxy to another, are we concatenating the extranonce_prefix and the local reserved portion, or are we hashing them?

we concatenate.
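A minimal sketch of that concatenation, with hypothetical names (not an API defined by the spec or the SRI codebase):

```rust
// Moving one layer downstream: the proxy takes the extranonce_prefix it was
// given by its upstream and appends the bytes it allocates locally (e.g. a
// per-downstream-channel counter). No hashing is involved.
fn next_layer_extranonce_prefix(upstream_prefix: &[u8], locally_reserved: &[u8]) -> Vec<u8> {
    let mut prefix = Vec::with_capacity(upstream_prefix.len() + locally_reserved.len());
    prefix.extend_from_slice(upstream_prefix);
    prefix.extend_from_slice(locally_reserved);
    // Whatever remains of `extranonce_size` after this allocation stays
    // available for the downstream node to split further or roll.
    prefix
}
```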

- `extranonce_size`: how many bytes are available for the locally reserved and downstream reserved areas of the Extended Extranonce.

And a [Standard Channel](#531-standard-channels) has the following property:
- `extranonce_prefix`: the Extended Extranonce bytes that were already allocated by the upstream server.
Contributor:

Why does a standard channel even need extranonce information if it's only used for header-only mining?

@Fi3 (Contributor), Sep 18, 2024:

you have fewer messages to send from the pool to the proxy

@plebhash (Contributor Author), Sep 26, 2024:

multiple standard channels are aggregated into a group channel

when an ExtendedMiningJob message comes into the group channel, it needs to be converted into multiple NewMiningJob messages

each NewMiningJob message needs to have a unique Merkle Root, otherwise there would be collisions in the search space

the uniqueness of each Merkle Root is determined by the static extranonce_prefix of each standard channel (by combining it with coinbase_prefix and coinbase_suffix from the Extended Job that came from upstream)

overall, the trick to assimilate this is to think in terms of proxy layers: usually, there are multiple layers of extended jobs coming from upstream, and only on the last layer are they converted into standard jobs, where Merkle Roots have to be unique
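To make the fan-out concrete, an illustrative-only sketch (hypothetical types and names, not the SRI API; it reuses the `standard_job_merkle_root` helper sketched earlier in this thread):

```rust
// One Standard Channel grouped under a Group Channel.
struct StandardChannel {
    channel_id: u32,
    extranonce_prefix: Vec<u8>,
}

// The relevant parts of an extended job received on the Group Channel.
struct ExtendedJob {
    job_id: u32,
    coinbase_prefix: Vec<u8>,
    coinbase_suffix: Vec<u8>,
    merkle_path: Vec<[u8; 32]>,
}

// What would be sent downstream as a NewMiningJob, one per Standard Channel.
struct StandardJob {
    channel_id: u32,
    job_id: u32,
    merkle_root: [u8; 32],
}

fn fan_out(job: &ExtendedJob, channels: &[StandardChannel]) -> Vec<StandardJob> {
    channels
        .iter()
        .map(|ch| StandardJob {
            channel_id: ch.channel_id,
            job_id: job.job_id,
            // Unique Merkle Root per channel, derived from its extranonce_prefix.
            merkle_root: standard_job_merkle_root(
                &job.coinbase_prefix,
                &ch.extranonce_prefix,
                &job.coinbase_suffix,
                &job.merkle_path,
            ),
        })
        .collect()
}
```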

Each Channel identifies a dedicated mining session associated with an authorized user.
Upstream stratum nodes accept work submissions and specify a mining target on a per-channel basis.

There can theoretically be up to `2^32` open Channels within one Connection.
Contributor:

Could you please point me to the part of the codebase where we are implementing the multiplexing?

Contributor Author:

I'm not sure what you mean by multiplexing here.

but maybe the channel_logic module of roles_logic_sv2 could provide you some insight (as well as its usage on roles crates)


A proxy can either transparently allow its clients to open separate Channels with the server (preferred behavior), or aggregate open connections from downstream devices into its own open channel with the server and translate the messages accordingly (present mainly for allowing v1 proxies).
Both options have some practical use cases.
In either case, proxies SHOULD aggregate clients' Channels into a smaller number of Connections.
Contributor:

None of the SV1 miners are going to have a standard channel since they require a non-homogeneous setup. How does aggregation work in that case?

At the same time, I have this question: each of these miners (clients) connects to the translator proxy, and each connection is a TCP connection. So how are we achieving multiplexing? My understanding was that with a single TCP connection we could manage around 2^32 channels, but these channels should point to the same endpoint and starting point. Each of these miners connects to the same endpoint (the translator proxy), but they each have different ports. Each server can only handle a limited number of TCP connections, which is nowhere near 2^32 (more like on the order of 10^5). So the aggregation doesn't happen at the level of the translator proxy and the miner, but rather between the translator proxy and another upstream connection, since they will have a single TCP connection through which we can have 2^32 channels for each downstream miner node. However, this raises a concern: the translator proxy becomes a single point of failure, so the provider needs to implement scaling strategies to address that.

Contributor Author:

None of the SV1 miners are going to have a standard channel since they require a non-homogeneous setup. How does aggregation work in that case?

tProxy makes sure that each SV1 miner is assigned a unique extranonce_prefix

@plebhash (Contributor Author), Sep 26, 2024:

My understanding was that with a single TCP connection, we could manage around 2^32 channels […] the translator proxy becomes a single point of failure, so the provider needs to implement scaling strategies to address that.

I see the practical limitation you are pointing out here. However, I think this is easily solvable by stacking multiple proxy layers.

This limitation only manifests if we try to connect more than 10^5 machines into one single proxy. But if they are distributed across multiple proxies, there's no issue.

And if we imagine there's one "funnel" proxy aggregating all of them, and then forwarding that into some upstream layer (maybe a pool or JDC), this upstream layer would still be able to have up to 2^32 channels in a single Connection.

Here are some comments about the practical aspects of this:

@plebhash force-pushed the update-mining-protocol branch 4 times, most recently from 7ac6b1d to 89eb9bc on September 26, 2024 at 15:13
@plebhash force-pushed the update-mining-protocol branch 3 times, most recently from 0d08032 to dfc1060 on October 14, 2024 at 15:29
@jakubtrnka (Collaborator) commented Oct 21, 2024

Aesthetic remark: As a reviewer, I'd appreciate it if the diffs were organized in a way that allows for simple inspection of what has been added, what was changed, and what was removed. Possibly split it across multiple commits. This commit looks as if large chunks were replaced by different text and reintroduced somewhere else, making it difficult to quickly validate.

@plebhash force-pushed the update-mining-protocol branch 2 times, most recently from ccbd358 to 8379f73 on October 28, 2024 at 21:04
@plebhash (Contributor Author):

Aesthetic remark: As a reviewer, I'd appreciate it if the diffs were organized in a way that allows for simple inspection of what has been added, what was changed, and what was removed. […]

thanks for your feedback @jakubtrnka

here are some updates:

  • I re-organized the commits to make the review process a bit easier
  • I moved the Proxy section to [WIP] add Proxy Annex #103
  • I addressed your comments around Jobs Channels

I would appreciate a second (final?) round of reviews, if possible

Comment on lines +1 to +6
In case the `*.png` files on the parent directory need to be adjusted, the `*.drawio.xml` files from this directory
should be imported into the `draw.io` online tool for editing.

After editing:
- export the `.png` file to replace the old one in the parent directory
- export the `.drawio.xml` file to replace the old one on this directory
Contributor Author:

@Sjors there you go

#98 (comment)

@plebhash (Contributor Author):

note: before merging this PR I still need to update the indices with chapter headers on README.md

Contributor Author:

this comment gave me some doubts about how accurate this image is

given that there are some subtle differences between coinbase_tx_prefix and coinbase_prefix, I wonder if the labels I put on coinbase_tx_prefix and coinbase_tx_suffix are correct

@TheBlueMatt can you please do a sanity check here?

Reply:

For non-custom work that looks close to me. Might be worth noting coinbase_tx_prefix also includes the coinbase tx version and input count bytes (so it's not just a scriptSig). The coinbase_tx_suffix is similarly not just the scriptPubKey; it includes the output count, multiple outputs (incl. segwit commitment), value, etc.

@plebhash force-pushed the update-mining-protocol branch 2 times, most recently from d6f69dd to e474030 on November 1, 2024 at 20:07
@pavlenex (Contributor) commented Nov 5, 2024

Hey @jakubtrnka, it would be great to get your final review on this one. @Fi3, if you get a chance, it would be great to get your pair of eyes on it as well. 🙏

@plebhash force-pushed the update-mining-protocol branch 3 times, most recently from 5415eb5 to e10349e on November 16, 2024 at 00:15
Successfully merging this pull request may close issue #96: need to cleanup and update Mining Protocol specs.
9 participants