enhancement: WASM error codes, early exit on error or kill switch #1337

lostman · 2023-09-07T08:37:25Z

Description

UPDATE: in addition to adding error codes to host functions as described below, this PR adds another host function:

fn early_exit(err_code: WasmIndexerError) -> !

Using this function we can remove more unwrap or expect calls.

Building on #1293.

This PR follows the example in wasmer repository:

https://github.com/wasmerio/wasmer/blob/master/examples/early_exit.rs#L60

    fn early_exit() -> Result<(), ExitCode> {
        // This is where it happens.
        Err(ExitCode(1))
    }

Note that the return value is Result<(), ExitCode> while the function is imported in WASM as

https://github.com/wasmerio/wasmer/blob/master/examples/early_exit.rs#L41

  (type $early_exit_t (func (param) (result)))

That is, as if it were fn early_exit();.

This PR does the same for our FFI functions. For instance:

fn put_object(
    mut env: FunctionEnvMut<IndexEnv>,
    type_id: i64,
    ptr: u32,
    len: u32,
) -> Result<(), WasmIndexerError> { }

Their types, as declared to the WASM module that will use them, don't mention the Result value:

https://github.com/FuelLabs/fuel-indexer/pull/1337/files#diff-447e55cb0fae7d02ac8fd8f7de5a521cbf60d3f2422b5166c39faeb9f995bea6R21-R29

The one difference between how things are handled in the early_exit example and in this PR is getting the error code out:

https://github.com/wasmerio/wasmer/blob/master/examples/early_exit.rs#L93

When I tried the same method:

Err(e) => match e.downcast::<ExitCode>()

I got no error out of it. However, the error was printed as RuntimeError(User(EarlyExit)), so the error was handled correctly. (User indicates that a user-defined error triggered a WASM trap).

So, I ended up handling the error this way:

https://github.com/FuelLabs/fuel-indexer/pull/1337/files#diff-fa722c66d2896c684e5a26aa2c56b7ec1734888609fbb64ecbc7dfd6a420c4ffR933

            } else {
                if let Some(e) = e
                    .source()
                    .and_then(|e| e.downcast_ref::<WasmIndexerError>())
                {
                    error!("Indexer({uid}) WASM execution failed: {e}.");
                } else {
                    error!("Indexer({uid}) WASM execution failed: {e:?}.");
                };
                self.db.lock().await.revert_transaction().await?;
                return Err(IndexerError::from(e));
            }

I am unsure why I couldn't downcast the error the same way as was shown in the example.
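For illustration only, the difference between downcasting the outer error and downcasting its source() can be reproduced with plain std types. WrappedRuntimeError and extract_code below are hypothetical stand-ins, not wasmer's RuntimeError or code from this PR, and the variant names are assumed. The sketch does not claim to explain wasmer's behavior; it only shows why one downcast can fail while the other succeeds.

```rust
use std::error::Error;
use std::fmt;

// Assumed, simplified version of the error type; the real variants may differ.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum WasmIndexerError {
    KillSwitch,
    DatabaseError,
}

impl fmt::Display for WasmIndexerError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            WasmIndexerError::KillSwitch => write!(f, "Kill switch has been triggered"),
            WasmIndexerError::DatabaseError => write!(f, "Database operation failed"),
        }
    }
}

impl Error for WasmIndexerError {}

// Hypothetical wrapper that hides the user error one level down the source chain.
#[derive(Debug)]
struct WrappedRuntimeError {
    inner: Box<dyn Error + 'static>,
}

impl fmt::Display for WrappedRuntimeError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "runtime error: {}", self.inner)
    }
}

impl Error for WrappedRuntimeError {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        Some(&*self.inner)
    }
}

// Try the error itself first, then fall back to its immediate source.
fn extract_code(e: &(dyn Error + 'static)) -> Option<WasmIndexerError> {
    e.downcast_ref::<WasmIndexerError>()
        .or_else(|| e.source().and_then(|s| s.downcast_ref::<WasmIndexerError>()))
        .copied()
}
```

With a wrapper like this, downcast_ref on the outer value yields None because its concrete type is the wrapper, while the same call on source() finds the user error.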

Testing steps

Manual testing:

Start an indexer.

cargo run -p fuel-indexer -- run --fuel-node-host beta-4.fuel.network --fuel-node-port 80 --replace-indexer --manifest examples/fuel-explorer/fuel-explorer/fuel_explorer.manifest.yaml

Redeploy:

cargo run -p forc-index -- deploy --path examples/fuel-explorer/fuel-explorer --replace-indexer

Expected output:

2023-09-07T09:12:49.663539Z  INFO fuel_indexer::service: 399: Resuming Indexer(fuellabs.explorer) from block 1240
2023-09-07T09:12:54.321165Z  INFO fuel_indexer::database: 206: Database loading schema for Indexer(fuellabs.explorer) with Version(143093c6bbe3a8937f4fb71514cbe6799266c44398866dbbd23e12b977bc1641).
2023-09-07T09:12:54.346555Z  INFO fuel_indexer::executor: 110: Indexer(fuellabs.explorer) subscribing to Fuel node at beta-4.fuel.network:80
2023-09-07T09:12:54.346681Z  WARN fuel_indexer::executor: 117: No end_block specified in manifest. Indexer will run forever.
2023-09-07T09:12:54.346717Z  INFO fuel_indexer::service: 335: Indexer(fuellabs.explorer) was replaced. Stopping previous version of Indexer(fuellabs.explorer).
2023-09-07T09:12:54.347376Z ERROR fuel_indexer::executor: 941: Indexer(fuellabs.explorer) WASM execution failed: Kill switch has been triggered.
2023-09-07T09:12:54.347819Z  INFO fuel_indexer::executor: 199: Kill switch flipped, stopping Indexer(fuellabs.explorer). <('.')>

In particular, this line indicates that an early termination happened due to the kill switch:

2023-09-07T09:12:54.347376Z ERROR fuel_indexer::executor: 941: Indexer(fuellabs.explorer) WASM execution failed: Kill switch has been triggered.

Changelog

FFI functions can access the kill switch indicator and exit early when the kill switch has been triggered.
FFI functions can exit early on error, including sqlx database operation errors.
Add WASM error codes (from @deekerno's PR)
Get an error code from the call to the indexer's WASM module.
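A minimal sketch of the first changelog item, assuming the kill switch is a shared atomic flag. IndexEnvSketch and put_object_sketch are hypothetical names that only mirror the shape of the host function shown in the description; the real environment type and database call are not reproduced here.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;

// Assumed, simplified error type; the real one has more variants.
#[derive(Debug, PartialEq, Eq)]
enum WasmIndexerError {
    KillSwitch,
}

// Hypothetical stand-in for the environment handed to host functions.
struct IndexEnvSketch {
    kill_switch: Arc<AtomicBool>,
}

// Check the flag before doing any work, and return an error if it is set.
// Under wasmer, returning Err from a host function is what ends execution early.
fn put_object_sketch(
    env: &IndexEnvSketch,
    _type_id: i64,
    _bytes: &[u8],
) -> Result<(), WasmIndexerError> {
    if env.kill_switch.load(Ordering::SeqCst) {
        return Err(WasmIndexerError::KillSwitch);
    }
    // The database write would happen here.
    Ok(())
}
```

The flag is read on every call, so an indexer that is busy inside a long handler still stops at its next host call.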

ra0x3 · 2023-09-08T15:15:52Z

@lostman

What's the difference between this and enhancement: implement error codes for WASM #1293 ?
Asking because this was ping'd for review but is based on develop (but description says it's a follow up)

lostman · 2023-09-08T15:27:56Z

@lostman What's the difference between this and #1293 ?

@deekerno's PR was a starting point. This PR adds a few things. From changelog:

FFI functions can access the kill switch indicator and exit early when the kill switch has been triggered. (added in this PR)
FFI functions can exit early on error, including sqlx database operation errors. (added in this PR)
Add WASM error codes (from @deekerno's PR)
Get an error code from the call to the indexer's WASM module. (added in this PR)

The most significant difference is how the error codes are returned:

e74645e#diff-447e55cb0fae7d02ac8fd8f7de5a521cbf60d3f2422b5166c39faeb9f995bea6L27-L28

In @deekerno's PR, the result is a u32 error code. As I explain in the description, in this PR I follow the early_exit example: the function signature, as declared in the FFI interface, mentions no Result type. wasmer terminates the execution immediately if the host function returns an error this way (which is exactly what we want here).

This way, even get_object, which returns u32, can be terminated early:

https://github.com/FuelLabs/fuel-indexer/pull/1337/files#diff-25a226eeb0da9150a8af6c4267e5745445aab8a84cbe942fa13d8a7756246c11R119

It now returns Result<u32, WasmIndexerError>. However, in the FFI interface, it is still -> *mut u8.
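As a rough analogy in plain Rust (no wasmer involved), a value-returning call that yields Result lets the first failure end the whole run, which a bare u32 return cannot express without reserving a sentinel value. get_object_sketch and run_handler_sketch are hypothetical, and treating a missing key as a database error is purely for illustration.

```rust
use std::collections::HashMap;

// Assumed, simplified error type.
#[derive(Debug, PartialEq, Eq)]
enum WasmIndexerError {
    DatabaseError,
}

// Hypothetical host call: returns a value on success, an error otherwise.
fn get_object_sketch(
    store: &HashMap<i64, u32>,
    type_id: i64,
) -> Result<u32, WasmIndexerError> {
    store
        .get(&type_id)
        .copied()
        .ok_or(WasmIndexerError::DatabaseError)
}

// Stand-in for a handler run: `?` stops at the first failing call,
// loosely mirroring how a trap ends WASM execution.
fn run_handler_sketch(
    store: &HashMap<i64, u32>,
    ids: &[i64],
) -> Result<Vec<u32>, WasmIndexerError> {
    let mut out = Vec::with_capacity(ids.len());
    for id in ids {
        out.push(get_object_sketch(store, *id)?);
    }
    Ok(out)
}
```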

ra0x3 · 2023-09-11T16:01:18Z

@lostman

Reading over your testing steps, I see the expected output for this PR. However, what is the result as of today on develop? I'm trying to compare what we have now (prior to this PR) with what we will have after this is merged.

ra0x3

@lostman

Following your testing steps I get the following output:

2023-09-11T16:02:51.282146Z  WARN fuel_indexer::executor: 117: No end_block specified in manifest. Indexer will run forever.
2023-09-11T16:02:51.282175Z  INFO fuel_indexer::service: 335: Indexer(fuellabs.explorer) was replaced. Stopping previous version of Indexer(fuellabs.explorer).
2023-09-11T16:02:51.282689Z ERROR fuel_indexer::executor: 937: Indexer(fuellabs.explorer) WASM execution failed: Kill switch has been triggered.
2023-09-11T16:02:51.283092Z  INFO fuel_indexer::executor: 195: Kill switch flipped, stopping Indexer(fuellabs.explorer). <('.')>

UX Questions:

- Why am I seeing an error! about something failing?
  - My re-deployment worked so ideally I shouldn't see any errors (as this can be confusing)
    - Maybe we can warn! any additional details?
- The last log message: "kill switch flipped" implies that the previous indexer was stopped (ok), but the message stops there, there's no additional info to suggest that my re-deployment worked

Still reviewing the PR. But the logs are confusing me as to what's actually happening (from a user perspective)

deekerno

Thanks for picking this up, @lostman! I left a few comments on the code itself.

Also, in addition to your manual testing, we should figure out a way to test the error codes themselves. Part of the reason my PR was still a WIP was that I hadn't figured out a good way to cause the errors (and that I was busy with other things 😅).

packages/fuel-indexer/src/database.rs

packages/fuel-indexer/src/ffi.rs

lostman · 2023-09-11T16:15:33Z

@ra0x3, agreed. I will add a special-case for the kill switch. That is, after all, the expected behavior.
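One way such a special case could look, sketched with hypothetical names (Severity and severity_for are not from this PR, and the variants are assumed): the kill switch is classified as an expected stop, and everything else stays an error.

```rust
// Assumed, simplified error type.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum WasmIndexerError {
    KillSwitch,
    DatabaseError,
}

// Hypothetical log severity chosen by the executor.
#[derive(Debug, PartialEq, Eq)]
enum Severity {
    Info,
    Error,
}

// A triggered kill switch is expected when an indexer is replaced,
// so it is not reported as a failure.
fn severity_for(err: WasmIndexerError) -> Severity {
    match err {
        WasmIndexerError::KillSwitch => Severity::Info,
        _ => Severity::Error,
    }
}
```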

ra0x3

Did a first pass
For changes like this it would give us more confidence if there was an associated test to prove the functionality works (we already don't have many service or executor tests as is)
- Tests also make the review easier

packages/fuel-indexer-macros/src/decoder.rs

packages/fuel-indexer-plugin/src/wasm.rs

packages/fuel-indexer/src/executor.rs

lostman · 2023-09-12T12:34:22Z

I added a WasmIndexExecutor test for exit codes. It is a bit crude but covers the basics.

ra0x3

Left some more feedback

@lostman I also wouldn't rush this PR
We do have a deadline, but this work should be able to make it in comfortably before that date, with plenty of time to spare
This is an extremely sensitive PR, so let's not rush it

packages/fuel-indexer-database/src/lib.rs

packages/fuel-indexer-api-server/src/uses.rs

packages/fuel-indexer-lib/src/lib.rs

ra0x3 · 2023-09-12T14:18:07Z

packages/fuel-indexer-macros/src/decoder.rs

@@ -833,7 +833,7 @@ impl From<ObjectDecoder> for TokenStream {
                                            .map(|query| query.to_string())
                                            .collect::<Vec<_>>();

-                                        d.lock().await.put_many_to_many_record(queries).await;
+                                        d.lock().await.put_many_to_many_record(queries).await.expect(&format!("Entity::save_many_to_many for {} failed.", stringify!(#ident)));


I think here we should be logging the error messages (and .expect) as close to the actual culprit call as possible

In this case the culprit of this error would be fuel_indexer::database::Database::put_many_to_many_record

If the error is logged/handled there, we don't need to handle it here

The error is handled in put_many_to_many_record. In WASM, it is a host function that returns Result<>, which triggers early termination.

The code here is for native execution. This expect achieves an equivalent behavior.

ra0x3 · 2023-09-12T14:19:07Z

packages/fuel-indexer-macros/src/decoder.rs

@@ -868,7 +871,7 @@ impl From<ObjectDecoder> for TokenStream {
                                        Self::TYPE_ID,
                                        self.to_row(),
                                        serialize(&self.to_row())
-                                    ).await;
+                                    ).await.expect(&format!("Entity::save for {} failed.", stringify!(#ident)));


Same comment here:

I think here we should be logging the error messages (and .expect) as close to the actual culprit call as possible

In this case the culprit of this error would be fuel_indexer::database::Database::put_object

If the error is logged/handled there, we don't need to handle it here

The error is handled in put_object. In WASM, it is a host function that returns Result<>, which triggers early termination.

The code here is for native execution. This expect achieves an equivalent behavior.

packages/fuel-indexer-tests/tests/service.rs

lostman · 2023-09-12T15:18:38Z

Left some more feedback

@lostman I also wouldn't rush this PR

We do have a deadline, but this work should be able to make it in comfortably before that date, with plenty of time to spare

This is an extremely sensitive PR, so let's not rush it

I'd rather merge this now (very soon) and have it battle-tested for a couple of weeks than merge it just before the deadline and have no time to resolve any issues that may arise.

I need this to finish #1150 and again, I'd rather test both for a couple of weeks before the deadline is upon us.

lostman · 2023-09-13T16:09:08Z

@ra0x3, @deekerno,

I added an early_exit function and tried using it in handler_block_wasm:

adee0f8#diff-7121bb4841992a7a257dbb97c0e60f371c59f2845b4413d5413b7adf779feb5aR22

and some trybuild tests failed to link, complaining about missing ff_early_exit at the linking stage:

https://github.com/FuelLabs/fuel-indexer/actions/runs/6171631718/job/16750296869#step:13:106

All integration tests succeeded, and I could compile and deploy indexers.

Any idea what could be going on?

It would've been a nice use-case for this function.

ra0x3 · 2023-09-13T19:06:58Z

Any idea what could be going on?

@lostman

In the trybuild tests, we manually add the missing FF symbols
Example
Let me know if this helps

lostman · 2023-09-13T20:58:56Z

@ra0x3, yes, it helps. Thanks!

Just pushed this:
46507e8

And trybuild tests pass 😄 (locally; waiting on CI)

ra0x3 · 2023-09-15T14:02:06Z

@lostman Is this meant to be re-reviewed? I do remember leaving a previous review, but I see I'm ping'd for review again. I also see that a lot of the feedback was not implemented (and no reason was provided as to why).

lostman · 2023-09-15T14:23:08Z

@ra0x3, yes, it is meant to be re-reviewed. I can't re-trigger the review request since it is already re-requested.

I implemented your feedback. What is missing?

ra0x3 · 2023-09-15T14:25:30Z

@lostman The review comments are unresolved, and they aren't labeled as "outdated". For any review comments where you implement the feedback, be sure to resolve the conversation (so it doesn't clutter the review and look as if it's unresolved). And for anything you didn't implement, just leave a responding comment as to why you chose not to implement the feedback.

lostman · 2023-09-15T14:36:42Z

Apologies. GitHub UI tricked me a little:

Completely missed these comments. 😂

ra0x3 · 2023-09-15T15:45:34Z

@lostman CI failing

lostman self-assigned this Sep 7, 2023

deekerno and others added 2 commits September 7, 2023 11:05

Implement error codes for FFI WASM functions

21e2e2f

early exit and kill switch

e74645e

lostman force-pushed the maciej/wasm-error-codes branch from cac8eaa to e74645e Compare September 7, 2023 09:07

lostman marked this pull request as ready for review September 7, 2023 13:59

lostman requested review from ra0x3 and deekerno as code owners September 7, 2023 13:59

This was referenced Sep 7, 2023

enhancement: store BlockData in the DB and ensure the indexers don't miss blocks #1297

Closed

Stop an indexer, or report an error, when database operation fails: "TypeId(-8293274664658733968) not found in tables: {}" #1176

Closed

ra0x3 reviewed Sep 11, 2023

deekerno suggested changes Sep 11, 2023

packages/fuel-indexer/src/database.rs

packages/fuel-indexer/src/ffi.rs

ra0x3 suggested changes Sep 11, 2023

review feedback

92e0139

lostman force-pushed the maciej/wasm-error-codes branch from 88d388b to 92e0139 Compare September 11, 2023 18:55

lostman added 7 commits September 11, 2023 20:55

Merge branch 'develop' into maciej/wasm-error-codes

69ca4bf

update comment

1b2318d

add early_exit FFI function and handle errors in load

712b489

fmt

018981f

add wasm exit codes test and fix an expect msg

969be90

fmt

b17177f

cargo sort

63920fa

lostman requested review from deekerno and ra0x3 September 12, 2023 12:49

Merge remote-tracking branch 'origin/develop' into maciej/wasm-error-…

a73f591

…codes

ra0x3 suggested changes Sep 12, 2023

lostman requested a review from ra0x3 September 12, 2023 16:14

ra0x3 linked an issue Sep 12, 2023 that may be closed by this pull request

Add error codes to WASM executors #1156

Closed

lostman force-pushed the maciej/wasm-error-codes branch 2 times, most recently from b6c0a78 to adee0f8 Compare September 13, 2023 11:09

improve ealy_exit function

a4549a8

lostman force-pushed the maciej/wasm-error-codes branch from adee0f8 to a4549a8 Compare September 13, 2023 11:37

lostman mentioned this pull request Sep 13, 2023

enhancement: don't allow missing blocks #1349

Merged

lostman added 2 commits September 13, 2023 22:51

return deserialization error if decoding blockdata fails

46507e8

simpler way to downcast

c295242

lostman force-pushed the maciej/wasm-error-codes branch from 184fbf7 to c295242 Compare September 13, 2023 20:58

lostman and others added 3 commits September 15, 2023 07:33

Update packages/fuel-indexer-lib/src/lib.rs

18f5bd3

Co-authored-by: rashad <[email protected]>

remove unused variant

a00ca22

Update packages/fuel-indexer-lib/src/lib.rs

0de8632

Co-authored-by: rashad <[email protected]>

Merge branch 'develop' into maciej/wasm-error-codes

146798c

ra0x3 previously approved these changes Sep 15, 2023

fmt

0baa5a4

lostman dismissed ra0x3’s stale review via 0baa5a4 September 15, 2023 17:27

deekerno approved these changes Sep 15, 2023

ra0x3 approved these changes Sep 15, 2023

ra0x3 merged commit 1b345b0 into develop Sep 15, 2023
18 checks passed

ra0x3 deleted the maciej/wasm-error-codes branch September 15, 2023 18:31
