Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use single codepath for sending transactions to a local and remote nodes #527

Merged
merged 7 commits into from
Jan 18, 2018

Conversation

dshulyak
Copy link
Contributor

@dshulyak dshulyak commented Jan 2, 2018

Fixes #529

Previously procedure for sending transactions was different based on
which type of node was used, local (light node) or remote.

This change removes separate code path for working with light node,
and the only difference will be backend for rpc client. In case of
light node all of the communication will be done over inproc dial.

Also there is a couple of related changes in this PR:

  • new EthereumTransactor that provides higher level API for working
    with ethereum network, and it is fully conformant with ethclient
  • new test rpc service that improves flexibility and coverage of
    txqueue manager tests

Closes: #529

@dshulyak dshulyak changed the title Use single codepath for sending transactions to a local and remote nodes [WIP] Use single codepath for sending transactions to a local and remote nodes Jan 2, 2018
@dshulyak dshulyak force-pushed the txmanager branch 4 times, most recently from 31f4cd7 to f4f5034 Compare January 2, 2018 14:50
@dshulyak dshulyak changed the title [WIP] Use single codepath for sending transactions to a local and remote nodes Use single codepath for sending transactions to a local and remote nodes Jan 2, 2018
@dshulyak
Copy link
Contributor Author

dshulyak commented Jan 2, 2018

@adambabik hello, i believe this change is ready for initial review.
I plan to add more tests for different edge cases and do other refactoring that is described in #338 , but i think it would be better to make separate PRs for those things.

@dshulyak
Copy link
Contributor Author

dshulyak commented Jan 2, 2018

Looks like i asked prematurely, e2e tests are failing with relevant error.

@dshulyak dshulyak force-pushed the txmanager branch 2 times, most recently from db1b531 to dcc0874 Compare January 2, 2018 16:02
@dshulyak
Copy link
Contributor Author

dshulyak commented Jan 2, 2018

At the moment only one test fails - TestCompleteMultipleQueuedTransactions, and I am pretty sure that I know the root cause of this issue. LesBackend internally uses txPool.GetNonce method when it sets the nonce for an incoming transaction. This method looks up a recent nonce at the local map of the pool - pool.nonce. But this method is not exported, and I have to use GetTransactionCount that is not consistent with txPool.GetNonce. I tried to verify it on develop branch:

tx txPool.GetNonce GetTransactionCount(pending)
1 0 0
2 1 0
3 2 0

I think I can add another interceptor, to a status rpc.Client, for GetTransactionCount method which will use LesBackend and GetPoolNonce. But I am pretty sure that it is a bug in LightEthereum client.

ethereum/go-ethereum#2880

@adambabik
Copy link
Contributor

Hey @dshulyak. Thank you for your contribution. I skimmed through the PR and it looks great! I will submit a detailed review tomorrow.

Regarding the nonce, in the current develop, the code that sends a transaction when using a remote node uses the same JSON-RPC method and arguments to get the nonce as before so the tests should pass just fine, unless these tests use a local node... In this case, a pool is used: https://github.com/ethereum/go-ethereum/blob/master/internal/ethapi/api.go#L1096-L1102 and this go-ethereum issue seems like a root cause. You can hot patch code in vendor/ and see if it fixes this problem. We can incorporate this patch if it will turn out to be necessary.

@dshulyak dshulyak force-pushed the txmanager branch 2 times, most recently from 316e0f8 to 10b69eb Compare January 3, 2018 08:26
@dshulyak
Copy link
Contributor Author

dshulyak commented Jan 3, 2018

@adambabik i don't see a reason why it won't be possible to reproduce this bug with upstream node. maybe it will be required to increase the number of concurrent transactions

@adambabik
Copy link
Contributor

@dshulyak you're right, this bug should be also visible when using a remote node. We may not have an automated test for the remote node, though.

To reviewers: the tests were failing because they use a local node and, before changes, they were using LES backend directly instead of the public API (that is eth_sendRawTransaction method). It turns out that the public API has a bug and does not use the transaction pool to return a correct nonce value.

@dshulyak
Copy link
Contributor Author

dshulyak commented Jan 3, 2018

I think that it is possible yo retry tx on "known transaction" error, both light and full nodes return it. But, i need to understand if there are any other side effects of doing it.

Copy link
Contributor

@adambabik adambabik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks great! I added a bunch of comments but they are minor suggestions or code style.

I will create a separate issue for this effort as it's related to #338 indeed, but not exactly.

)

// EthereumTransactor provides methods to create transactions for ethereum network.
type EthereumTransactor interface {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To keep it consistent, maybe we can call it EthTransactor, similarly to EthTxClient?


defaultGas = 90000

cancelTimeout = time.Minute
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe defaultTimeout to match defaultGas naming convention?

args := queuedTx.Args

var gasPrice *big.Int
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is fine but else can be omitted:

gasPrice := (*big.Int)(args.GasPrice)
if gasPrice == nil {
  // ...
}

})

err := txQueueManager.QueueTransaction(tx)
s.NoError(err)

var wg sync.WaitGroup
var mu sync.Mutex
completeTxErrors := make(map[error]int)
completeTxErrors := map[error]int{}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it should be possible to get rid of this map and only assert errCompleteTransaction to be always nil.

// As we want to avoid network connection, we mock LES with a known error
// and treat as success.
s.nodeManagerMock.EXPECT().LightEthereumService().Return(nil, errTxAssumedSent)
nodeConfig, nodeErr := params.NewNodeConfig("/tmp", params.RopstenNetworkID, true)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As this block repeats three times, maybe we can extract it to a separate TxQueueTestSuite method?

// FakePublicTxApi used to generate mock by mockgen util.
// This was done because PublicTransactionPoolAPI is located in internal/ethapi module
// and there is no easy way to generate mocks from internal modules.
type FakePublicTxApi interface {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we can call it FakePublicTransactionPoolAPI to make it clearer what it is?


import (
context "context"
big "math/big"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is big required?

"github.com/ethereum/go-ethereum/common"
"github.com/ethereum/go-ethereum/common/hexutil"
"github.com/ethereum/go-ethereum/rpc"
gomock "github.com/golang/mock/gomock"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe gomock is not required before the package name here.

@@ -0,0 +1,91 @@
// Code generated by MockGen. DO NOT EDIT.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@adambabik
Copy link
Contributor

@dshulyak I created a separate issue for this PR: #529.

@dshulyak
Copy link
Contributor Author

dshulyak commented Jan 3, 2018

@adambabik thanks for a review, i pushed an update

@andytudhope
Copy link

Hey @dshulyak! Adam suggested that we offer a bounty for this issue because you had done such good work. I have created and funded it and linked this PR to the issue with a minor edit. Could you please log into http://openbounty.status.im/ click on your profile picture, select to Update Your Payment Details and make sure to click the Update button after putting in an address you control?

@dshulyak
Copy link
Contributor Author

dshulyak commented Jan 3, 2018

that's awesome @andytudhope, I updated my address as you suggested

@dshulyak
Copy link
Contributor Author

dshulyak commented Jan 3, 2018

Can someone re-trigger a job on travis? It failed due to 'https://github.com/alecthomas/gometalinter/': Failed to connect to github.com port 443: Connection timed out

@divan
Copy link
Contributor

divan commented Jan 3, 2018

@dshulyak done.

Copy link
Contributor

@adambabik adambabik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry about doing this review in two iterations but I found a bit more time just today.

Minor remarks overall, just please make sure to look at the comments relating to TestInvalidPassword and TestCompleteTransaction.

} else {
hash, err = m.completeLocalTransaction(queuedTx, password)
}
hash, err := m.completeTransaction(selectedAccount, queuedTx, password)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor: I noticed that maybe this arguments grouping could be better: m.completeTransaction(queuedTx, selectedAccount, password)

var emptyHash gethcommon.Hash

log.Info("verifying account password for transaction", "id", queuedTx.ID)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would move this log just above m.accountManager.VerifyAccountPassword or remove at all.

@@ -81,31 +114,30 @@ func (s *TxQueueTestSuite) TestCompleteTransaction() {
s.NoError(err)

go func() {
_, errCompleteTransaction := txQueueManager.CompleteTransaction(tx.ID, TestConfig.Account1.Password)
s.Equal(errTxAssumedSent, errCompleteTransaction)
_, errCompleteTransaction := txQueueManager.CompleteTransaction(tx.ID, password)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After these changes, we do not expect any errors so we can verify if the returned hash is equal to tx.Hash, right?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, a few lines above, there is txQueueManager.SetTransactionReturnHandler(). I think we should also change it to expect err to be nil.

// Transaction should be already removed from the queue.
s.False(txQueueManager.TransactionQueue().Has(tx.ID))

// Wait for all CompleteTransaction calls.
wg.Wait()
s.Equal(completeTxErrors[errTxAssumedSent], 1)
s.Equal(1, completedTx, "only 1 tx expected to be completed")
s.Equal(txCount-1, inprogressTx, txCount-1, "txs expected to be reported as inprogress")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't it be:

s.Equal(txCount-1, inprogressTx, "txs expected to be reported as inprogress")

?

s.nodeManagerMock.EXPECT().LightEthereumService().Return(nil, keystore.ErrDecrypt)
nonce := hexutil.Uint64(10)
gas := hexutil.Big(*big.NewInt(defaultGas + 1))
s.setupTransactionPoolAPI(account, nonce, gas, keystore.ErrDecrypt)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test does not test a real scenario. keystore.ErrDecrypt is returned by SendRawTransaction while it should be by VerifyAccountPassword. I would recommend to change that and also make sure that SendRawTransaction is not called.

})

err := txQueueManager.QueueTransaction(tx)
s.NoError(err)

var wg sync.WaitGroup
var mu sync.Mutex
completeTxErrors := make(map[error]int)
for i := 0; i < 3; i++ {
var completedTx int
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor: as there are multiple vars, it's beneficial to group them with var ().

@@ -956,6 +956,14 @@ func (s *PublicTransactionPoolAPI) GetRawTransactionByBlockHashAndIndex(ctx cont

// GetTransactionCount returns the number of transactions the given address has sent for the given block number
func (s *PublicTransactionPoolAPI) GetTransactionCount(ctx context.Context, address common.Address, blockNr rpc.BlockNumber) (*hexutil.Uint64, error) {
// go-ethereum issue https://github.com/ethereum/go-ethereum/issues/2880
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to add this to go-ethereum patches. cc @divan

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there is a way to get rid of it. we can maintain nonce locally for each address that status is using. increment it on every successful transaction, and reset in case if it was changed remotely. it is easy with new addr locker and will prevent issues both with upstream and local nodes.

will do a separate pr after additional testing

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It may introduce a lot of complexity into status-go and something that potentially can break in the future if go-ethereum implementation changes. But if you find some elegant solution that would be awesome.

I have seen this PR ethereum/go-ethereum#15794 and read the whole discussion in ethereum/go-ethereum#2880 and it looks like a separate JSON-RPC method is the best solution that can preserve backward compatibility of eth_getTransactionCount and allows to get the number of pending transactions from the pool.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

adding a separate method also seemed like a good idea to me, but it may take some time until it will be merged and a new geth release will be available. and even then status app needs to be always aware that upstream node is not an old geth release or another client, e.g parity.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that should be implemented via patches to keep track of our changes into the go-ethereum. This is blocked by #492, which will be merged tomorrow (internally agreed deadline).

@dshulyak dshulyak force-pushed the txmanager branch 4 times, most recently from 29c5534 to 55bcd13 Compare January 5, 2018 19:51
Copy link
Contributor

@adambabik adambabik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 👏

select {
case <-c:
return nil
case <-timer.C:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

case <-time.After(d):
  return ErrTimeout

will do the same.

@@ -265,9 +307,11 @@ func (s *TxQueueTestSuite) TestDiscardTransaction() {
err := txQueueManager.QueueTransaction(tx)
s.NoError(err)

w := make(chan struct{})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not a fan of one char variables. In such contexts, done := make(chan struct{}) works pretty well and indicates an intention clearly.

@dshulyak
Copy link
Contributor Author

dshulyak commented Jan 8, 2018

thanks, I will change those things in one of the pr's that is based on this one.

travis failed because of #539

Copy link
Contributor

@divan divan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work!
The modification to the vendored go-ethereum should be made through patches, so this PR is currently blocked by #492. Will try to resolve this block asap.

log.Info("default gas will be used. estimated gas", gas, "is lower than", defaultGas)
gas = big.NewInt(defaultGas)
gas := (*big.Int)(args.Gas)
if args.Gas == nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's good. The other way is to pass existing gas value to EstimateGas arguments and it will take it as upper bound, however, a lower value might be returned.

Skipping EstimateGas call if gas is defined should be better, though.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also the idea was to keep EthTxClient compatible with https://github.com/ethereum/go-ethereum/blob/master/ethclient/ethclient.go#L36 , it is not of high priority of course

@divan
Copy link
Contributor

divan commented Jan 17, 2018

As #492 has been merged, changes to internal/ethapi/api.go should be implemented as a patch (see https://github.com/status-im/status-go/tree/develop/geth-patches directory). @dshulyak when you have a time, could you make this change?

The exact way of applying patches to our fork might be changed in #551, but so far patch files are in geth-patches dir.

dshulyak and others added 6 commits January 17, 2018 20:12
Previously procedure for sending transactions was different based on
which type of node was used, local (light node) or remote.

This change removes separate code path for working with light node,
and the only difference will be backend for rpc client. In case of
light node all of the communication will be done over inproc dial.

Also there is a couple of related changes in this PR:
- new EthereumTransactor that provides higher level API for working
  with ethereum network, and it is fully conformant with ethclient
- new test rpc service that improves flexibility and coverage of
  txqueue manager tests
@dshulyak
Copy link
Contributor Author

@divan pushed the patch

@divan
Copy link
Contributor

divan commented Jan 17, 2018

@dshulyak perfect, thanks! Actually, I think we should not wait for #551, because the process of applying patches should be repeatable anyway, so we can add new patches meanwhile. Thus, unblocking this PR now.

@divan divan removed the blocked label Jan 17, 2018
@divan
Copy link
Contributor

divan commented Jan 18, 2018

Trying to understand why tests on Travis time outs before merging.

@divan divan merged commit 0771e7d into status-im:develop Jan 18, 2018
@divan
Copy link
Contributor

divan commented Jan 18, 2018

@dshulyak awesome job.

@dshulyak dshulyak deleted the txmanager branch January 18, 2018 16:57
@dshulyak dshulyak restored the txmanager branch January 18, 2018 16:57
dshulyak added a commit that referenced this pull request Feb 15, 2018
This change is not needed after the following PR was merged:
#527
dshulyak added a commit that referenced this pull request Feb 15, 2018
This change is not needed after the following PR was merged:
#527
pedropombeiro pushed a commit that referenced this pull request Feb 15, 2018
This change is not needed after the following PR was merged:
#527
pedropombeiro pushed a commit that referenced this pull request Feb 16, 2018
This change is not needed after the following PR was merged:
#527
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants