Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Follow up EOS issue 2204: check block before pushing to fork db #831

Open
abitmore opened this issue Apr 7, 2018 · 13 comments
Open

Follow up EOS issue 2204: check block before pushing to fork db #831

abitmore opened this issue Apr 7, 2018 · 13 comments
Labels
1b User Story The User Story details a requirement. It may reference a parent Epic. It may reference child Task(s) 2b Gathering Requirements Status indicating currently refining User Stories and defining Requirements 3c Enhancement Classification indicating a change to the functionality of the existing imlementation 4b Normal Priority Priority indicating the moderate impact to system/user -OR- existing workaround is costly to perform 6 Security Impact flag identifying system/user security security

Comments

@abitmore
Copy link
Member

abitmore commented Apr 7, 2018

EOSIO/eos#2204:

Right now we only check if a received block is linkable to an existing block in fork_db but do no other checks. If the network code also doesn't reject multiple blocks at the same block height from the same node, it opens itself up to an easy DOS attack.

Rather than relying only on the network code to protect against such attacks, we should do some light validation of the received blocks (particularly enough to check that it was correctly signed by an active producer from the perspective of that block) before including it into the fork_db.

@abitmore
Copy link
Member Author

abitmore commented May 7, 2018

EOSIO/eos#2718 is related.

@jmjatlanta
Copy link
Contributor

And here I thought I was taking on an easy fix. It seems that PR #884 opened a discussion with many moving pieces. I am hoping to move that discussion here, to help all find the best solution.

I am generalizing here, and must admit I only have a cursory understanding of the issues involved.

EOS 2204 adds a check to make sure that the witness is an active witness before adding its block to the chain.

EOS 2718 adds a little more data to the header to help get the 2/3+1 majority back, in the case of forks that have diverged and contain subsets of witness approvals that would otherwise require manual intervention to merge.

(1) Can we implement 2204 without 2718? And if so, (1b) does that open the door to an easier DOS than what we have now?

(2) Can a comprehensive fix (i.e. DPoS 3.0) be completed in a short enough time frame to improve security and network stability, or would incremental fixes be the better option?

@abitmore @pmconrad I will pause on PR #884 unless it is deemed worthwhile to continue.

Relevant: #884 (comment)

@pmconrad
Copy link
Contributor

pmconrad commented May 11, 2018

AFAICS EOS 2204 adds that check, except for the first block in a round. That means that as a DOS protection measure the fix is useless (the DOS can still be executed at the start of each round).

EOS 2718 is a protection against a witness signing blocks on two sides of a fork simultaneously, which could lead to multiple forks having a 2/3+1 majority. I think the solution is incomplete (insofar as it does not take timing effects into account), and it also doesn't solve the problem that the list of active witnesses can be different in each side of the fork.

I agree about pausing this, because I don't see an effective short-term solution.

@jmjatlanta jmjatlanta removed their assignment May 16, 2018
@abitmore
Copy link
Member Author

There is an interesting discussion in Steem which is related to this issue: steemit/steem#2471

@abitmore abitmore added this to the Future Non-Consensus-Changing Release milestone May 21, 2018
@ryanRfox ryanRfox added 6 Security Impact flag identifying system/user security 1b User Story The User Story details a requirement. It may reference a parent Epic. It may reference child Task(s) 2b Gathering Requirements Status indicating currently refining User Stories and defining Requirements 3c Enhancement Classification indicating a change to the functionality of the existing imlementation labels May 24, 2018
@abitmore
Copy link
Member Author

Perhaps it's mentioned elsewhere: there is a DoS factor that an old block will be pushed to fork db without being validated as long as

  1. its previous is in fork db, and
  2. it's not too old.

@abitmore abitmore changed the title Follow up EOS issue 2204 Follow up EOS issue 2204: check block before pushing to fork db Aug 29, 2018
@abitmore
Copy link
Member Author

There is a scenario described in steemit/steem#2911. When there is a long reversible fork, node performance got heavily hit when tries to push another fork which is even longer.

@ryanRfox ryanRfox added the 4b Normal Priority Priority indicating the moderate impact to system/user -OR- existing workaround is costly to perform label Feb 12, 2019
pmconrad added a commit that referenced this issue Sep 15, 2019
Fix #831 - check witness signature before adding block to fork db
@pmconrad
Copy link
Contributor

Fixed by #1987

@abitmore
Copy link
Member Author

abitmore commented Sep 17, 2019

On top of #1987, I think we can add a max-blocks-per-block-height limitation to fork_db to get even better protection. Due to the protection added in #1987, 10 would be big enough for the new limit.

@abitmore abitmore reopened this Sep 17, 2019
@pmconrad
Copy link
Contributor

pmconrad commented Sep 17, 2019

IMO a better idea would be to limit fork_db to one block per witness per block height. That would automatically limit the blocks per height to the number of witnesses, without the potential for a DOS attack.

@abitmore
Copy link
Member Author

one block per witness per block height

With #1987, one attack vector is a legit active witness signing lots of blocks, which can be a) on the same height, or b) on different heights but with the same timestamp so they would still pass the verify_signing_witness() check. In this case, the solution is to have one block per witness per timestamp, but not only one block per witness per block height. When the attack is detected, voters can vote down the misbehaving witness to stop the attack after the next maintenance interval.

Another attack vector is when there is a minor fork created by a legit witness, since the first block would pass verify_signing_witness() but not be fully evaluated in time, an attacker can broadcast lots of blocks which link to the first block to a victim node, to bloat the victim node's fork db. The attacker can even create fake forks which are longer than the main fork, which would cause the victim node to pop and re-push blocks on the main fork again and again. This attack combined with the first attack vector would make the situation much worse.

To mitigate influence of the 2nd attack vector,

  • firstly, we need to make sure to NOT propagate a block if it's not fully evaluated;
  • secondly, to protect the fork db from being bloated too hard, it's good to have a limit on the total number of unverified blocks in the fork db. In case when there are legit long forks, the size (actually, height) of the fork db would increase as well, so a limit on the number of blocks per height would help. I think a fixed small number (E.G. 10 or even smaller) would be practically better than a dynamic number (E.G. number of witnesses or of active witnesses);
  • finally, if the above 2 methods are in place, although a pop-push attack is still possible, the attacker would be disconnected right after the victim node found a block received from the attacker is invalid, so the attack would be mitigated and the attack surface would shrink when pushing normal blocks received from other peers over time.

@pmconrad
Copy link
Contributor

the solution is to have one block per witness per timestamp

The problem here is that this would open up a different way for a DOS: if a witness spams the network with different blocks for the same timestamp, the nodes who receive the "wrong" blocks first will be unable to switch to the majority chain. I think the only thing we can do about a malicious witness is detect and log different blocks with the same timestamp (more precisely: with the same slot time).

For the other attack vector, the height of a fork is not that much of a problem. Once the fake fork outgrows the majority fork the node will try to switch to it. At that point, the legitimate block will have its scheduled_witnesses populated, and from then on the attacker can no longer push blocks on top of it.

So in order to bloat a node's fork db an attacker will have to push many blocks on top of a minority fork in breadth-first fashion. The only real protection against that is to evaluate the legitimate block, as you suggest, so the fake blocks will be recognized. Any absolute limit on the number of blocks in fork_database creates a DOS attack vector, see above.

Since switching to a minority fork is expensive, we can use an absolute number > 1 of unverified blocks as a threshold for temporarily switching to a minority fork, just so the scheduled_witnesses field gets filled. (We can avoid some of the cost of the switch by fork()ing the node process, performing the switch in the child process, and communicating the scheduled_witnesses back to the parent. That way the main process could continue to run as usual, and we would avoid the cost of switching back to the majority fork because the child can simply terminate. fork is quite efficient on Linux, but I don't know about Mac/Windows.)

@abitmore
Copy link
Member Author

abitmore commented Sep 18, 2019

DOS: if a witness spams the network with different blocks for the same timestamp, the nodes who receive the "wrong" blocks first will be unable to switch to the majority chain.

Good catch. There are 2 scenarios here.

  1. A rogue witness produced multiple blocks with different height but the same aslot (timestamp). In this case, a potential solution is to keep only the block with the highest height.

  2. A rogue witness (W1) produced multiple blocks with the same height as well as the same aslot (timestamp), aka double-producing. Let's name the blocks (B1 .. Bn). The next witness (W2) would have built a new block B1-1 on top of B1. The next witness (W3) might received the other blocks (B2-Bn) before B1, when received B1-1, if B1 was refused, it can't switch to B1-1, so it may produce a block on top of B2, aka forking.
    It's a tricky situation. Ideally, to punish W1 due to its misbehavior, W2 should refuse to build blocks on top of any of (B1 .. Bn), and W3 should build blocks on top of W2's new block. However, if not implemented appropriately, due to network latency issues, it's highly possible that the chain will be forking badly when such attack occurs.

I think the only thing we can do about a malicious witness is detect and log different blocks with the same timestamp (more precisely: with the same slot time).

I agree, although we can also look for better solutions in the meanwhile.

fork()ing the node process

This is an interesting idea, although I guess it's not easy to do for our use case. One concern is popping blocks from the major fork would update quite some memory segments, which would need a copy in the child process. It means we need more RAM to run the node.

@abitmore
Copy link
Member Author

Keeping this issue open for future improvement. See discussions above.

@abitmore abitmore reopened this Jun 30, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
1b User Story The User Story details a requirement. It may reference a parent Epic. It may reference child Task(s) 2b Gathering Requirements Status indicating currently refining User Stories and defining Requirements 3c Enhancement Classification indicating a change to the functionality of the existing imlementation 4b Normal Priority Priority indicating the moderate impact to system/user -OR- existing workaround is costly to perform 6 Security Impact flag identifying system/user security security
Projects
None yet
Development

No branches or pull requests

4 participants