Get and set account is kind of time consuming #2652

yutianwu · 2018-11-01T03:40:42Z

hi, recently, I have done some stress testing for our Dapp, I found that besides performance of secp256, encode and decode of account is also time consuming.

I have done a benchmark test in my laptop(MacBook Pro (15-inch, 2018), 2.2 GHz Intel Core i7).

GetAccount:

func BenchmarkAccountMapperGetAccountFound(b *testing.B) {
	ms, capKey, _ := setupMultiStore()
	cdc := codec.New()
	RegisterBaseAccount(cdc)

	// make context and mapper
	ctx := sdk.NewContext(ms, abci.Header{}, false, log.NewNopLogger())
	mapper := NewAccountKeeper(cdc, capKey, ProtoBaseAccount)

	// assumes b.N < 2**24
	for i := 0; i < b.N; i++ {
		arr := []byte{byte((i & 0xFF0000) >> 16), byte((i & 0xFF00) >> 8), byte(i & 0xFF)}
		addr := sdk.AccAddress(arr)
		acc := mapper.NewAccountWithAddress(ctx, addr)
		acc.SetCoins(sdk.Coins{sdk.NewCoin("LTC", sdk.NewInt(1000)),
			sdk.NewCoin("BTC", sdk.NewInt(1000)),
			sdk.NewCoin("ETH", sdk.NewInt(1000)),
			sdk.NewCoin("XRP", sdk.NewInt(1000)),
			sdk.NewCoin("BCH", sdk.NewInt(1000)),
			sdk.NewCoin("EOS", sdk.NewInt(1000))})
		mapper.SetAccount(ctx, acc)
	}

	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		arr := []byte{byte((i & 0xFF0000) >> 16), byte((i & 0xFF00) >> 8), byte(i & 0xFF)}
		mapper.GetAccount(ctx, sdk.AccAddress(arr))
	}
}

/*
goos: darwin
goarch: amd64
pkg: github.com/cosmos/cosmos-sdk/x/auth
BenchmarkAccountMapperGetAccountFound-12    	  100000	     16937 ns/op
PASS
*/

SetAccount

func BenchmarkAccountMapperSetAccount(b *testing.B) {
	ms, capKey, _ := setupMultiStore()
	cdc := codec.New()
	RegisterBaseAccount(cdc)

	// make context and mapper
	ctx := sdk.NewContext(ms, abci.Header{}, false, log.NewNopLogger())
	mapper := NewAccountKeeper(cdc, capKey, ProtoBaseAccount)

	b.ResetTimer()
	// assumes b.N < 2**24
	for i := 0; i < b.N; i++ {
		arr := []byte{byte((i & 0xFF0000) >> 16), byte((i & 0xFF00) >> 8), byte(i & 0xFF)}
		addr := sdk.AccAddress(arr)
		acc := mapper.NewAccountWithAddress(ctx, addr)
		acc.SetCoins(sdk.Coins{sdk.NewCoin("LTC", sdk.NewInt(1000)),
			sdk.NewCoin("BTC", sdk.NewInt(1000)),
			sdk.NewCoin("ETH", sdk.NewInt(1000)),
			sdk.NewCoin("XRP", sdk.NewInt(1000)),
			sdk.NewCoin("BCH", sdk.NewInt(1000)),
			sdk.NewCoin("EOS", sdk.NewInt(1000))})
		mapper.SetAccount(ctx, acc)
	}
}

/*
goos: darwin
goarch: amd64
pkg: github.com/cosmos/cosmos-sdk/x/auth
BenchmarkAccountMapperSetAccount-12    	  100000	     22989 ns/op
PASS
*/

Assume that there is 4000tps in our dapp, set and get account will cost (16937 + 22989) * 4000 = 159.7 ms. The result is kind of unacceptable.

IMHO, I think we can cache the deserialized result so we may don't need to deserialize accounts again and again. It will save a half of the time.

ValarDragon · 2018-11-01T05:54:49Z

Thanks for pointing this out! We have plans for a cache mainly for generalized amino decodings. (i.e. an fully associative LRU "L2" cache ) I think it is very reasonable to make an "L1" cache just for account decoding though. I made such an example for a staking bottleneck awhile ago.

I ran Get through the profiler. I have some good news and some bad news. The bottleneck here is the time is mostly amino. (70% of the time taken was in amino, ~25% in iavl, and the remainder in some inefficiencies in how we deal with slicing)

The good news is that I've already knocked off 3 percentage points of what was causing this amino delay, and we'll get that speed boost once we switch to the latest go-amino version in the sdk! But uhh, 3% is not that helpful.

I'll briefly describe the amino bottlenecks in this particular situation (note: numbers are approximate).

Almost all of the amino internal values are getting heap allocated. Fixing this ( Remove the usage of "writers" for speedups tendermint/go-amino#226) appears to me that it would cause an approximately 5-7 total percentage points of speedup
Theres currently a write lock on amino type info. I never understood the reason behind this, if we can remove this, it gives 10 total percentage points of speed back. Unclear if we can, I haven't talked to anyone about this, so I may be missing context for its existence
The remainder is mostly lost in reflection land, its unclear to me if this can be sped up without building our #verypostlaunch plans of making an amino compiler.

So we should be able to drop the GetTime by a fifth just via fairly not hard amino upgrades, and then use an LRU cache to avoid duped decoding!

/cc @liamsi

ValarDragon · 2018-11-01T06:12:11Z

Oh didn't notice this earlier. The implicit assumption of 4000 tps in the DApp is unrealistic in the current software, for a meaningfully long running chain. 4000 signature verifications will take much longer than a second. Even ignoring the signature verification time, KMS signing time, and the p2p network delay, the state tree will be balloon in size since we are in the account model with no pruning of old accounts. The state tree blow-up will then have I/O such that 4000 tps is no longer feasible. (Recall that at 4000 tps, each tx must take a quarter of a milisecond, its highly doubtful imo that sequential account reads will be on the similar enough iavl paths to facilitate this.)

I am still all in favor of reducing this time, since I do agree that it seems too high for such a basic operation!

yutianwu · 2018-11-01T07:56:59Z

Yeah, 4000 tps seems unreasonable now, just for an example. For now, amino takes too much time and signature verification as well.

@ValarDragon I don't fully understand the L2 cache for amino, can you kindly give some more details?

Actually, we kinda hope that we can also cache the IAVL tree in memory and do not clear the checkState and DeliverState at Commit stage. But IAVAL tree empties branches every time when committing, it is useless to remain DeliverState to next block. Is there any plan to not clearing IAVL tree when committing?

zmanian · 2018-11-01T13:11:58Z

So my point of view is that account state should be special cased for amino serialization and use a non-reflection based implemention either derived from a protobuf implementation or handrolled.

I don't think we need to wait for some future amino compiler.

I think the biggest thing is just figure out what the specialized serialization/deserialization interface should look like

alexanderbez · 2018-11-01T17:12:15Z

It seems the actionable item out of this for now is to create an "L1" LRU cache similar to what we do in staking. Amino improvements and modifications will come later.

yutianwu · 2018-11-02T02:36:15Z

It seems the actionable item out of this for now is to create an "L1" LRU cache similar to what we do in staking. Amino improvements and modifications will come later.

yeah, it is reasonable to add LRU cache to account, and replacement of amino can be done after that.

but the use case of account is kind of different from validators. When getting an account from store, usually we want to change it. so we can not just simply cache the marshalled account.

I did not dive into the use of the validator get from cache. but I have a concern that can we get a wrong state of validator if somewhere changes some properties of validator and not saves it to store for we do not deep copy the validator. @ValarDragon

ValarDragon · 2018-11-02T04:07:51Z

Agreed @zmanian! I think that would have a really large impact on over-all state machine performance. (Probably the most impact after we optimize IAVL performance and any asymptotic improvements)

@yutianwu that'd be great! Here's an example you can model it off of:

cosmos-sdk/x/stake/keeper/validator.go

Line 12 in 60d188d

    
           // Cache the amino decoding of validators, as it can be the case that repeated slashing calls

yutianwu · 2018-11-02T06:24:48Z

IMHO, the use case of account is different from validator. account is changed more frequently than validator.

I think we should use the address of account as the key of the cache map and flush all changes to disk in Commit.

What do you you think? @ValarDragon

yutianwu · 2018-11-02T08:25:56Z

I have a rough idea:

Add block life cycle cache like CheckState for account, all changes will happen in unmarshalled Account. Then we will marshal updated Accounts and flush them to disk. During one block, we will only need to unmarshal and marshal an Account once no matter how many times we get and set an Account.
Add LRU cache for marshalled Account, like what we do in staking so that we don't need to unmarshal the same Accounts. But there is a different from cache of Validators, we need to make a deep copy of every Account we get from cache for we may change the properties of Account and we will not get the original state of a Account the next time.

But I have done a benchmark of deepcopy library, its performance is not that good for using reflect.

PS:
I have a concern that can we get a wrong state of validator if somewhere changes some properties of validator and not saves it to store for we do not deep copy the validator. @alexanderbez @ValarDragon @zmanian

tac0turtle · 2020-07-03T07:43:07Z

account structure has changed recently. The bank module is responsible for storing coin data about the account. Closing this issue, please reopen or carry the actionable into a new issue

yutianwu changed the title ~~Get and set account is kind of time consumint~~ Get and set account is kind of time consuming Nov 1, 2018

yutianwu mentioned this issue Nov 1, 2018

[R4R] Add benchmark for get and set account #2653

Merged

ValarDragon added the T: Performance Performance improvements label Nov 1, 2018

ValarDragon added the post-launch label Nov 1, 2018

yutianwu mentioned this issue Nov 4, 2018

WIP: [x/auth] Add lru cache for account #2678

Closed

fedekunze removed the post-launch label Jul 16, 2019

tac0turtle closed this as completed Jul 3, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Get and set account is kind of time consuming #2652

Get and set account is kind of time consuming #2652

yutianwu commented Nov 1, 2018 •

edited

Loading

ValarDragon commented Nov 1, 2018 •

edited

Loading

ValarDragon commented Nov 1, 2018 •

edited

Loading

yutianwu commented Nov 1, 2018 •

edited

Loading

zmanian commented Nov 1, 2018

alexanderbez commented Nov 1, 2018

yutianwu commented Nov 2, 2018 •

edited

Loading

ValarDragon commented Nov 2, 2018

yutianwu commented Nov 2, 2018

yutianwu commented Nov 2, 2018

tac0turtle commented Jul 3, 2020

Get and set account is kind of time consuming #2652

Get and set account is kind of time consuming #2652

Comments

yutianwu commented Nov 1, 2018 • edited Loading

ValarDragon commented Nov 1, 2018 • edited Loading

ValarDragon commented Nov 1, 2018 • edited Loading

yutianwu commented Nov 1, 2018 • edited Loading

zmanian commented Nov 1, 2018

alexanderbez commented Nov 1, 2018

yutianwu commented Nov 2, 2018 • edited Loading

ValarDragon commented Nov 2, 2018

yutianwu commented Nov 2, 2018

yutianwu commented Nov 2, 2018

tac0turtle commented Jul 3, 2020

yutianwu commented Nov 1, 2018 •

edited

Loading

ValarDragon commented Nov 1, 2018 •

edited

Loading

ValarDragon commented Nov 1, 2018 •

edited

Loading

yutianwu commented Nov 1, 2018 •

edited

Loading

yutianwu commented Nov 2, 2018 •

edited

Loading