Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Get and set account is kind of time consuming #2652

Closed
yutianwu opened this issue Nov 1, 2018 · 10 comments
Closed

Get and set account is kind of time consuming #2652

yutianwu opened this issue Nov 1, 2018 · 10 comments
Labels
T: Performance Performance improvements

Comments

@yutianwu
Copy link
Contributor

yutianwu commented Nov 1, 2018

hi, recently, I have done some stress testing for our Dapp, I found that besides performance of secp256, encode and decode of account is also time consuming.

I have done a benchmark test in my laptop(MacBook Pro (15-inch, 2018), 2.2 GHz Intel Core i7).

GetAccount:

func BenchmarkAccountMapperGetAccountFound(b *testing.B) {
	ms, capKey, _ := setupMultiStore()
	cdc := codec.New()
	RegisterBaseAccount(cdc)

	// make context and mapper
	ctx := sdk.NewContext(ms, abci.Header{}, false, log.NewNopLogger())
	mapper := NewAccountKeeper(cdc, capKey, ProtoBaseAccount)

	// assumes b.N < 2**24
	for i := 0; i < b.N; i++ {
		arr := []byte{byte((i & 0xFF0000) >> 16), byte((i & 0xFF00) >> 8), byte(i & 0xFF)}
		addr := sdk.AccAddress(arr)
		acc := mapper.NewAccountWithAddress(ctx, addr)
		acc.SetCoins(sdk.Coins{sdk.NewCoin("LTC", sdk.NewInt(1000)),
			sdk.NewCoin("BTC", sdk.NewInt(1000)),
			sdk.NewCoin("ETH", sdk.NewInt(1000)),
			sdk.NewCoin("XRP", sdk.NewInt(1000)),
			sdk.NewCoin("BCH", sdk.NewInt(1000)),
			sdk.NewCoin("EOS", sdk.NewInt(1000))})
		mapper.SetAccount(ctx, acc)
	}

	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		arr := []byte{byte((i & 0xFF0000) >> 16), byte((i & 0xFF00) >> 8), byte(i & 0xFF)}
		mapper.GetAccount(ctx, sdk.AccAddress(arr))
	}
}

/*
goos: darwin
goarch: amd64
pkg: github.com/cosmos/cosmos-sdk/x/auth
BenchmarkAccountMapperGetAccountFound-12    	  100000	     16937 ns/op
PASS
*/

SetAccount

func BenchmarkAccountMapperSetAccount(b *testing.B) {
	ms, capKey, _ := setupMultiStore()
	cdc := codec.New()
	RegisterBaseAccount(cdc)

	// make context and mapper
	ctx := sdk.NewContext(ms, abci.Header{}, false, log.NewNopLogger())
	mapper := NewAccountKeeper(cdc, capKey, ProtoBaseAccount)

	b.ResetTimer()
	// assumes b.N < 2**24
	for i := 0; i < b.N; i++ {
		arr := []byte{byte((i & 0xFF0000) >> 16), byte((i & 0xFF00) >> 8), byte(i & 0xFF)}
		addr := sdk.AccAddress(arr)
		acc := mapper.NewAccountWithAddress(ctx, addr)
		acc.SetCoins(sdk.Coins{sdk.NewCoin("LTC", sdk.NewInt(1000)),
			sdk.NewCoin("BTC", sdk.NewInt(1000)),
			sdk.NewCoin("ETH", sdk.NewInt(1000)),
			sdk.NewCoin("XRP", sdk.NewInt(1000)),
			sdk.NewCoin("BCH", sdk.NewInt(1000)),
			sdk.NewCoin("EOS", sdk.NewInt(1000))})
		mapper.SetAccount(ctx, acc)
	}
}

/*
goos: darwin
goarch: amd64
pkg: github.com/cosmos/cosmos-sdk/x/auth
BenchmarkAccountMapperSetAccount-12    	  100000	     22989 ns/op
PASS
*/

Assume that there is 4000tps in our dapp, set and get account will cost (16937 + 22989) * 4000 = 159.7 ms. The result is kind of unacceptable.

IMHO, I think we can cache the deserialized result so we may don't need to deserialize accounts again and again. It will save a half of the time.

@yutianwu yutianwu changed the title Get and set account is kind of time consumint Get and set account is kind of time consuming Nov 1, 2018
@ValarDragon
Copy link
Contributor

ValarDragon commented Nov 1, 2018

Thanks for pointing this out! We have plans for a cache mainly for generalized amino decodings. (i.e. an fully associative LRU "L2" cache ) I think it is very reasonable to make an "L1" cache just for account decoding though. I made such an example for a staking bottleneck awhile ago.

I ran Get through the profiler. I have some good news and some bad news. The bottleneck here is the time is mostly amino. (70% of the time taken was in amino, ~25% in iavl, and the remainder in some inefficiencies in how we deal with slicing)

The good news is that I've already knocked off 3 percentage points of what was causing this amino delay, and we'll get that speed boost once we switch to the latest go-amino version in the sdk! But uhh, 3% is not that helpful.

I'll briefly describe the amino bottlenecks in this particular situation (note: numbers are approximate).

  • Almost all of the amino internal values are getting heap allocated. Fixing this ( Remove the usage of "writers" for speedups tendermint/go-amino#226) appears to me that it would cause an approximately 5-7 total percentage points of speedup
  • Theres currently a write lock on amino type info. I never understood the reason behind this, if we can remove this, it gives 10 total percentage points of speed back. Unclear if we can, I haven't talked to anyone about this, so I may be missing context for its existence
  • The remainder is mostly lost in reflection land, its unclear to me if this can be sped up without building our #verypostlaunch plans of making an amino compiler.

So we should be able to drop the GetTime by a fifth just via fairly not hard amino upgrades, and then use an LRU cache to avoid duped decoding!

/cc @liamsi

@ValarDragon ValarDragon added the T: Performance Performance improvements label Nov 1, 2018
@ValarDragon
Copy link
Contributor

ValarDragon commented Nov 1, 2018

Oh didn't notice this earlier. The implicit assumption of 4000 tps in the DApp is unrealistic in the current software, for a meaningfully long running chain. 4000 signature verifications will take much longer than a second. Even ignoring the signature verification time, KMS signing time, and the p2p network delay, the state tree will be balloon in size since we are in the account model with no pruning of old accounts. The state tree blow-up will then have I/O such that 4000 tps is no longer feasible. (Recall that at 4000 tps, each tx must take a quarter of a milisecond, its highly doubtful imo that sequential account reads will be on the similar enough iavl paths to facilitate this.)

I am still all in favor of reducing this time, since I do agree that it seems too high for such a basic operation!

@yutianwu
Copy link
Contributor Author

yutianwu commented Nov 1, 2018

Yeah, 4000 tps seems unreasonable now, just for an example. For now, amino takes too much time and signature verification as well.

@ValarDragon I don't fully understand the L2 cache for amino, can you kindly give some more details?

Actually, we kinda hope that we can also cache the IAVL tree in memory and do not clear the checkState and DeliverState at Commit stage. But IAVAL tree empties branches every time when committing, it is useless to remain DeliverState to next block. Is there any plan to not clearing IAVL tree when committing?

@zmanian
Copy link
Member

zmanian commented Nov 1, 2018

So my point of view is that account state should be special cased for amino serialization and use a non-reflection based implemention either derived from a protobuf implementation or handrolled.

I don't think we need to wait for some future amino compiler.

I think the biggest thing is just figure out what the specialized serialization/deserialization interface should look like

@alexanderbez
Copy link
Contributor

It seems the actionable item out of this for now is to create an "L1" LRU cache similar to what we do in staking. Amino improvements and modifications will come later.

@yutianwu
Copy link
Contributor Author

yutianwu commented Nov 2, 2018

It seems the actionable item out of this for now is to create an "L1" LRU cache similar to what we do in staking. Amino improvements and modifications will come later.

yeah, it is reasonable to add LRU cache to account, and replacement of amino can be done after that.

but the use case of account is kind of different from validators. When getting an account from store, usually we want to change it. so we can not just simply cache the marshalled account.

I did not dive into the use of the validator get from cache. but I have a concern that can we get a wrong state of validator if somewhere changes some properties of validator and not saves it to store for we do not deep copy the validator. @ValarDragon

@ValarDragon
Copy link
Contributor

Agreed @zmanian! I think that would have a really large impact on over-all state machine performance. (Probably the most impact after we optimize IAVL performance and any asymptotic improvements)

@yutianwu that'd be great! Here's an example you can model it off of:

// Cache the amino decoding of validators, as it can be the case that repeated slashing calls

@yutianwu
Copy link
Contributor Author

yutianwu commented Nov 2, 2018

IMHO, the use case of account is different from validator. account is changed more frequently than validator.

I think we should use the address of account as the key of the cache map and flush all changes to disk in Commit.

What do you you think? @ValarDragon

@yutianwu
Copy link
Contributor Author

yutianwu commented Nov 2, 2018

I have a rough idea:

  1. Add block life cycle cache like CheckState for account, all changes will happen in unmarshalled Account. Then we will marshal updated Accounts and flush them to disk. During one block, we will only need to unmarshal and marshal an Account once no matter how many times we get and set an Account.
  2. Add LRU cache for marshalled Account, like what we do in staking so that we don't need to unmarshal the same Accounts. But there is a different from cache of Validators, we need to make a deep copy of every Account we get from cache for we may change the properties of Account and we will not get the original state of a Account the next time.

But I have done a benchmark of deepcopy library, its performance is not that good for using reflect.

PS:
I have a concern that can we get a wrong state of validator if somewhere changes some properties of validator and not saves it to store for we do not deep copy the validator. @alexanderbez @ValarDragon @zmanian

@tac0turtle
Copy link
Member

account structure has changed recently. The bank module is responsible for storing coin data about the account. Closing this issue, please reopen or carry the actionable into a new issue

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
T: Performance Performance improvements
Projects
None yet
Development

No branches or pull requests

6 participants