Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Filecoin Virtual Machine integration #8293

Merged
merged 37 commits into from
Mar 16, 2022
Merged

Filecoin Virtual Machine integration #8293

merged 37 commits into from
Mar 16, 2022

Conversation

raulk
Copy link
Member

@raulk raulk commented Mar 11, 2022

Related (but not blocked by) filecoin-project/filecoin-ffi#217

This PR integrates the reference FVM into Lotus. In doing so, it also integrates the new Rust-based builtin actors through the FFI.

The salient changes are:

  • Abstracting out the Virtual Machine in Lotus with a new Virtual Machine Interface that is implemented by the Lotus VM (suggested by @raulk to be renamed to Legacy VM) and the new FVM struct in Lotus.
  • Integrating the FVM and builtin actors into the FFI (see FVM integration filecoin-ffi#217). The FFI exposes new endpoints to:
    • Construct FVMs
    • Execute messages (both implicit and explicit)
    • Flush
  • Refactor the circulating supply logic to supply a partially-calculated value to the FVM for v14 (and before) only. This logic can be dropped if FVM support for v14 is not desired.
  • Put usage of the new FVM behind the LOTUS_USE_FVM_EXPERIMENTAL flag. If enabled, Lotus will use FVMs (and through it the Rust-based actors) instead of the Lotus VM and Golang-based actors.

arajasek and others added 30 commits February 4, 2022 12:46
1. Move lock, loading, etc into GetFilVested.
2. Call it directly when creating the FVM.
3. Detach GetFilLocked from state manager.

Really, this just makes it a bit easier to reason about this code.
…work-version

feat: cli: set current network version from params
Updates the FVM
This is what the lotus VM does.
Use either Lotus VM or FVM consistently
@arajasek arajasek marked this pull request as ready for review March 12, 2022 19:31
@arajasek arajasek requested a review from a team as a code owner March 12, 2022 19:31
@codecov
Copy link

codecov bot commented Mar 12, 2022

Codecov Report

Merging #8293 (9ea623e) into master (362c73b) will decrease coverage by 0.11%.
The diff coverage is 26.42%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #8293      +/-   ##
==========================================
- Coverage   40.34%   40.23%   -0.12%     
==========================================
  Files         679      681       +2     
  Lines       73961    74139     +178     
==========================================
- Hits        29842    29832      -10     
- Misses      38897    39076     +179     
- Partials     5222     5231       +9     
Impacted Files Coverage Δ
chain/stmgr/utils.go 21.84% <0.00%> (-0.76%) ⬇️
chain/vm/fvm.go 0.00% <0.00%> (ø)
chain/vm/gas.go 78.57% <ø> (ø)
chain/vm/gas_v0.go 92.94% <ø> (ø)
.../lotus-sim/simulation/blockbuilder/blockbuilder.go 0.00% <0.00%> (ø)
node/builder_chain.go 100.00% <ø> (ø)
chain/consensus/filcns/compute_state.go 71.59% <25.00%> (-2.53%) ⬇️
chain/vm/vmi.go 50.00% <50.00%> (ø)
chain/stmgr/supply.go 63.28% <61.90%> (-0.50%) ⬇️
chain/stmgr/call.go 65.94% <64.28%> (-5.57%) ⬇️
... and 29 more

github.com/filecoin-project/specs-actors/v7 v7.0.0-20211222192039-c83bea50c402/go.mod h1:p6LIOFezA1rgRLMewbvdi3Pp6SAu+q9FtJ9CAleSjrE=
github.com/filecoin-project/specs-actors/v7 v7.0.0-rc1.0.20220118005651-2470cb39827e/go.mod h1:TA5FwCna+Yi36POaT7SLKXsgEDvJwc0V/L6ZsO19B9M=
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we haz a tag? whats new in go actors anyways? PRU fix?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Idk why go sum wants this tbh...but go sum thinks for itself (the version in go.mod is correct)

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah just ignore go.sum, it only records all checksums seen as part of resolving deps. you can often end up with several older versions listed in go.sum compared to the one actually being used.

@@ -1943,6 +1942,7 @@ github.com/whyrusleeping/cbor-gen v0.0.0-20200826160007-0b9f6c5fb163/go.mod h1:f
github.com/whyrusleeping/cbor-gen v0.0.0-20210118024343-169e9d70c0c2/go.mod h1:fgkXqYy7bV2cFeIEOkVTZS/WjXARfBqSH6Q2qHL33hQ=
github.com/whyrusleeping/cbor-gen v0.0.0-20210219115102-f37d292932f2/go.mod h1:fgkXqYy7bV2cFeIEOkVTZS/WjXARfBqSH6Q2qHL33hQ=
github.com/whyrusleeping/cbor-gen v0.0.0-20210303213153-67a261a1d291/go.mod h1:fgkXqYy7bV2cFeIEOkVTZS/WjXARfBqSH6Q2qHL33hQ=
github.com/whyrusleeping/cbor-gen v0.0.0-20210713220151-be142a5ae1a8/go.mod h1:fgkXqYy7bV2cFeIEOkVTZS/WjXARfBqSH6Q2qHL33hQ=
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tag plz

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no change from master here

@jennijuju
Copy link
Member

@raulk IIUC yushi implemented this but you opened pr - if so, you should review this

chain/consensus/filcns/compute_state.go Outdated Show resolved Hide resolved
chain/consensus/filcns/compute_state.go Outdated Show resolved Hide resolved
chain/stmgr/call.go Show resolved Hide resolved
Comment on lines 128 to 131
stTree, err := sm.StateTree(bstate)
if err != nil {
return nil, err
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There seems to be an error behaviour change in here. Mind explaining?

Copy link
Contributor

@arajasek arajasek Mar 15, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@raulk Well, there's a new potential error here, if that's what you mean, I can add a descreptive error message. The cause of the change is just that we don't have (or want) the StateTree method on the VM interface.

Comment on lines +249 to +257
stateCid, err = vmi.Flush(ctx)
if err != nil {
return nil, xerrors.Errorf("flushing vm: %w", err)
}

stTree, err := state.LoadStateTree(cbor.NewCborStore(buffStore), stateCid)
if err != nil {
return nil, xerrors.Errorf("loading state tree: %w", err)
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Let's add a comment here stating why we need to flush and load the state tree.
  2. This Flush wouldn't be necessary with the intrinsic VM, so consider adding a specialization here, since the Lotus VM is not going anywhere for the foreseeable future and this would cause a small perf regression.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Will do.
  2. Eh, I don't think I care enough -- the minor perf hit to simulating messages seems fine. I'd rather not have more IF FVM clauses than strictly necessary.

chain/vm/fvm.go Show resolved Hide resolved
}

func NewFVM(ctx context.Context, opts *VMOpts) (*FVM, error) {
circToReport := opts.FilVested
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are there no solutions other than snaking through this FilVested field? Seems a little bit ad-hoc when combined with the pre-existing CircSupplyCalc field. And the fact that it won't be even used nv15 and forward, makes me even uneasier.

An alternative solution would be to expand the signature of CircSupplyCalc to take in the network version. Since the value of that is a StateManager member, it has all it needs to return FilVested if nv < 15.

Then this logic can be entirely independent of the network version, and we'd just pass the value down and let the FVM do the right thing.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@raulk My priority here is not changing the existing logic for the Lotus VM. I would rather have a new field that the LotusVM entirely ignores -- that way if there's anything wrong with this PR, only FVM M0 testing is affected.

If it'll make you feel better, we can drop this entirely from the final released version of Lotus for the v16 upgrade and just always use the calculated CircSupply.

chain/vm/vmi.go Outdated Show resolved Hide resolved
chain/vm/vmi.go Outdated Show resolved Hide resolved
chain/vm/vmi.go Outdated Show resolved Hide resolved
@arajasek arajasek mentioned this pull request Mar 15, 2022
4 tasks
if os.Getenv("LOTUS_USE_FVM_EXPERIMENTAL") == "1" {
// This is needed so that the FVM does not have to duplicate the genesis vesting schedule, one
// of the components of the circ supply calc.
// This field is NOT needed by the LegacyVM, and also NOT needed by the FVM from v15 onwards.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What changed in v15 so that this would not be needed by FVM?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First Protocol Improvement / Bugfix mentioned here

(also adding a link in the comment)

@@ -95,12 +95,13 @@ func SetupStorageMiners(ctx context.Context, cs *store.ChainStore, sys vm.Syscal
Syscalls: mkFakedSigSyscalls(sys),
CircSupplyCalc: csc,
NetworkVersion: nv,
BaseFee: types.NewInt(0),
BaseFee: big.Zero(),
FilVested: big.Zero(),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're only passing these opts into legacy VMs from here correct? So this is just nice zero value setting for clarity and won't cause v14 devnets to read bad values for the vested fil?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct, yeah. Also Vested fil is equal to zero at genesis, so should be consistent.

@@ -64,6 +70,8 @@ func (sm *StateManager) Call(ctx context.Context, msg *types.Message, ts *types.
pheight = ts.Height() - 1
}

// Since we're simulating a future message, pretend we're applying it in the "next" tipset
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a current message and we're executing it for real. "Current message is applied at parent height + 1"

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Clarifying, thanks.

Copy link
Contributor

@magik6k magik6k left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, just one question.


// VerifyConsensusFault is similar to the one in syscalls.go used by the LegacyVM, except it never errors
// Errors are logged and "no fault" is returned, which is functionally what go-actors does anyway
func (x *FvmExtern) VerifyConsensusFault(ctx context.Context, a, b, extra []byte) (*ffi_cgo.ConsensusFault, int64) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens if any of those extern calls panic? Do we crash the node, or do we behave like the current VM and just fail execution?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@magik6k Thanks a lot for raising this. This is a somewhat open question about exactly what we wanna do -- currently:

  • Rust panics get converted to errors at the FFI layer
  • Extern panics such as this one are...unclear? Maybe also get lowered to errors at the FFI laye, maybe crash the node? @Stebalien isn't sure, we can relatively easy answer this by experimenting.
  • Any error returned by the FFI (including such downgraded panics) cause the ApplyMessage call to fail and the node to freeze. In terms of the LegacyVM, this means all such errors are "fatal errors". I'm fairly opposed to this, but a decision will be made later.

Copy link
Member Author

@raulk raulk Mar 16, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Extern panics such as this one are...unclear? Maybe also get lowered to errors at the FFI laye, maybe crash the node? @Stebalien isn't sure, we can relatively easy answer this by experimenting.

The Rust => Go call path is brand new, so this definitely deserves a ton of attention.

@arajasek arajasek merged commit 90d78f9 into master Mar 16, 2022
@arajasek arajasek deleted the feat/fvm branch March 16, 2022 19:51
Copy link
Member Author

@raulk raulk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, barring @magik6k's very valid concern about panic handling/translation at the FFI layer. I can't approve because GitHub thinks I'm the author of this PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants