3s consensus - With consensus time included into consensus intervals. #3622
base: master
Conversation
Co-authored-by: Christopher Schuchardt <[email protected]>
- if (block_received_index + 1 == context.Block.Index)
+ if (block_received_index + 1 == context.Block.Index && onPrepareBlockIndex + 1 == context.Block.Index)
  {
-     var diff = TimeProvider.Current.UtcNow - block_received_time;
+     // Include the consensus time into the consensus intervals.
+     var diff = TimeProvider.Current.UtcNow - onPrepareReceivedTime;
This is the major change in this PR.
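For context, here is a minimal sketch of what the diff above amounts to, using assumed names and simplified expressions (this is not the verbatim ConsensusService code): the timer for the next round is anchored to the moment the previous PrepareRequest was handled, so the time spent reaching consensus counts against the block interval instead of being added on top of it.

```csharp
// Illustrative sketch only: anchor the next-round timer to onPrepareReceivedTime
// (previously block_received_time), so consensus time is part of the interval.
TimeSpan blockInterval = TimeSpan.FromMilliseconds(neoSystem.Settings.MillisecondsPerBlock);
TimeSpan elapsed = TimeProvider.Current.UtcNow - onPrepareReceivedTime;

// If consensus already consumed part (or all) of the interval, fire sooner (or immediately).
TimeSpan delay = elapsed >= blockInterval ? TimeSpan.Zero : blockInterval - elapsed;
ChangeTimer(delay); // assumed helper that schedules the next consensus timer
```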
This should be in another PR and not here, if that is the case.
And for your information, as I said in our meetings, this configuration has been tested in the past.
I think that you need more experimental tests before publishing things too quickly, and even before writing new standards (which are not even necessary).
I am not saying that the change is terrible, @Jim8y, but there are some consequences; I think that you did not analyse the entire scenario before pushing this and creating standards with it.
I will surely be against this change here in this PR for a series of reasons.
Try this in a decentralized scenario, or insert delays (you can do that with Docker), or set it up on AWS/Azure in different locations.
> this configuration has been tested in the past.
A lot has changed in the code since then. Would that affect anything like this? If so, we have to run these tests again to make sure it still holds.
We have set up a real testnet and carried out stress tests and stability tests for weeks. The network works fine, with no issues. As for the "this should be in another PR" point: all changes in this PR now serve, and only serve, the 3-second consensus; it is meaningless to open and test PRs that contain only part of these changes, because none of the code in this PR means anything without the rest.
We encourage small PRs to make review and testing easier, but that does not mean everything must be as small as possible.
If there is no review issue and no real logic problem, we should not split a complete and already tested PR.
You are right, @cschuchardt88, that is also what I was saying in my last message. Let's see.
- [ContractMethod(CpuFee = 1 << 15, RequiredCallFlags = CallFlags.ReadStates)]
- public BigInteger GetGasPerBlock(DataCache snapshot)
+ [ContractMethod(CpuFee = 1 << 15, RequiredCallFlags = CallFlags.ReadStates | CallFlags.WriteStates)]
+ public BigInteger GetGasPerBlock(ApplicationEngine engine, DataCache snapshot)
  {
      return GetSortedGasRecords(snapshot, Ledger.CurrentIndex(snapshot) + 1).First().GasPerBlock;
  }
@shargon @roman-khimov please check this. I am trying to update the GasPerBlock without really updating it, because it should be a council thing.
You can change the arguments without a hardfork, because both arguments are native ones.
But we should not write into storage here... that should be done by the council.
Maybe the CNs can detect when this change is made by the council and start using the new time?
Hi @shargon, I think my latest commit has addressed your concern:
- it reuses the existing GetGasPerBlock;
- it avoids writing to the store and instead calculates the value directly;
- it ensures the council can still update the GasPerBlock, and the new value can be used immediately.
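As a rough illustration of that approach (assumed names and an assumed scaling rule; this is not the actual NeoToken code), the idea is to derive the effective per-block value at read time from the council-voted record instead of persisting anything new:

```csharp
// Sketch only: the council-voted GasPerBlock stays in storage untouched; the
// effective value is computed on read, scaled by the block-time ratio, so a
// shorter block interval does not change the overall issuance rate.
private BigInteger GetEffectiveGasPerBlock(DataCache snapshot, uint millisecondsPerBlock)
{
    // Existing read path, reused as-is.
    BigInteger votedGasPerBlock = GetSortedGasRecords(snapshot, Ledger.CurrentIndex(snapshot) + 1)
        .First().GasPerBlock;

    const uint LegacyMillisecondsPerBlock = 15_000; // assumed baseline the voted value refers to
    return votedGasPerBlock * millisecondsPerBlock / LegacyMillisecondsPerBlock;
}
```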
UT updated accordingly.
I intentionally have two `return currentGasPerBlock` statements to make the logic clearer, as they cover different situations. The code can be optimized, if necessary, once the logic passes review.
The ContractMethod cannot have ApplicationEngine and DataCache parameters at the same time.
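If that constraint holds, one hedged way around it (a sketch, assuming the snapshot can be taken from the engine; the exact property name differs across Neo versions) would be:

```csharp
// Sketch: declare only the ApplicationEngine parameter and obtain the snapshot
// from it, instead of declaring both ApplicationEngine and DataCache.
[ContractMethod(CpuFee = 1 << 15, RequiredCallFlags = CallFlags.ReadStates)]
public BigInteger GetGasPerBlock(ApplicationEngine engine)
{
    var snapshot = engine.SnapshotCache; // assumption: may be named Snapshot in older versions
    return GetSortedGasRecords(snapshot, Ledger.CurrentIndex(snapshot) + 1).First().GasPerBlock;
}
```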
1. Most of the changes in this PR are simply not needed at all (the only relevant one is, in fact, the timer shift).
2. No hardfork is needed to change the block time in the current structure. It is just a matter of restarting consensus nodes with the appropriate configuration.
3. No native contract changes are required to change the GAS per block value. It is just a matter of sending a committee transaction. This one is really serious, by the way: we do have a governance model, and your changes just break it.
4. Given 2 and 3, we can simply schedule the 3s block time for some post-Echidna date, create and send a transaction, and restart the CNs. This won't and can't be perfectly aligned, true, but I doubt that is a problem in practice. Changing governance assumptions behind the scenes is a much bigger problem.
5. But this still does not answer the time lock problem we can have with mainnet.
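As a rough illustration of point 2 (a sketch using Neo's ProtocolSettings record, not operational guidance): the block interval already comes from node configuration, so changing it is a restart-with-new-configuration operation rather than a code change.

```csharp
using Neo;

// Sketch: the same value normally comes from the MillisecondsPerBlock entry in the
// ProtocolConfiguration section of config.json; shown here via the settings record.
var threeSecondSettings = ProtocolSettings.Default with
{
    MillisecondsPerBlock = 3000 // 3-second blocks instead of the current 15000
};
```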
There are changes that are not necessary for this PR.
Some of them were extensively explored in the past with strong experiments. Thus, for their current implementation, REAL tests and at least MINIMAL care in running a real scenario are surely necessary.
This PR has already been running in a real scenario for 2 weeks, stress test included, and of course it will also be tested on the testnet for weeks, but that will be after the merge of this PR.
@Jim8y There are some problems with this change when we have change views, and more commonly when the multiply-by factor is applied. It also has a cascade effect when a node has delays and has not yet processed the last commit, while the node that processed it and is the Speaker would already have sent the next Prepare Request. Maybe other components of Neo Core have improved, which is surely true, but in the past we were not able to run this scenario efficiently. I am not saying it is not possible, but my suggestion would surely be to merge these changes separately from the time change, which does not necessarily need them at first.
I will try to run this in an abnormal scenario and check the results.
Sorry, the testnet was set up and ran for weeks, with all information and results shared in the core dev chat group and the TC chat group. While the testnet was running, I encouraged everyone to run tests on it. The testnet was shut down after running for a few weeks due to high server costs. And before we bring the changes in this PR to mainnet, we will of course have a testnet for our newly released core version, so if you have any concerns about the changes, you can still do some testing on that testnet.
I remember seeing some results; was @superboyiii in charge of the simulation scenario?
Yes, @superboyiii set up a 7-node testnet on cloud servers; for a while, when the stress test was running, one of the nodes was running on my local Mac machine. That testnet ran for weeks and smoothly generated new blocks with 3-second consensus. Such a test is very expensive, so it is not likely to be repeated unless there are changes that directly affect this PR.
I finished tests in a high-latency, selectively distributed scenario where the CNs cannot reach each other easily (sometimes it is not even possible), and also with a 1.5s delay under a 0.5s block generation period.
Change views occur as expected and the system keeps its stability.
Congratulations to the core devs for the general work: the consensus is much more stable, as is the P2P; my latest simulation environment surprised even me. Regarding the PR itself, all changes look good to me, but some modifications are still needed before approval.
@@ -35,6 +35,8 @@ private class Timer { public uint Height; public byte ViewNumber; }
    private readonly IActorRef blockchain;
    private ICancelable timer_token;
    private DateTime block_received_time;
I think this can be dropped now.
As previously discussed, modifying the block time could introduce risks for some projects, specifically projects that rely on the assumption of a fixed block time for time-based calculations (vestings, timelocks, rewards, ...). For a smart contract to be impacted by this change, it must be aware of the current time in some way. I think this can only happen through one of the following methods (if there are other ways, suggestions are welcome):
A quick search for these patterns reveals some contracts that could be affected by this change. It would be advisable to contact each project and evaluate the impact on a case-by-case basis.
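One hypothetical example (not taken from any real project; all names are illustrative) of a pattern such a search would look for: a contract that treats the block index as a clock by hard-coding the 15-second block time, which silently shortens its effective lock period if blocks become faster.

```csharp
// Hypothetical vesting check, for illustration only.
static class VestingExample
{
    const uint AssumedSecondsPerBlock = 15;            // baked-in 15s block-time assumption
    const uint VestingPeriodSeconds = 30 * 24 * 3600;  // intended 30-day lock

    // With 15s blocks the index delta is ~172,800 blocks (~30 days of wall time);
    // with 3s blocks the same delta elapses in roughly 6 days.
    public static bool IsUnlocked(uint startIndex, uint currentIndex)
    {
        uint unlockIndex = startIndex + VestingPeriodSeconds / AssumedSecondsPerBlock;
        return currentIndex >= unlockIndex;
    }
}
```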
Description
Credits for this PR belong to @vncoelho. This PR duplicates the changes in #3612 and adds further corresponding updates.
Changes in this PR are:
Fixes # (issue)
Type of change
Checklist: