You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
2022-08-12T20:37:21.8572085Z stage-2: chain: 2022-08-12T20:37:21.840Z SwingSet: kernel: anachrophobia strikes vat v7
2022-08-12T20:37:21.8577230Z stage-2: chain: 2022-08-12T20:37:21.841Z SwingSet: kernel: expected: {"0":"vatstoreSet","1":"vom.o+11/5","2":"{\"currentBalance\":{\"body\":\"{\\\"brand\\\":{\\\"@qclass\\\":\\\"slot\\\",\\\"iface\\\":\\\"Alleged: IST brand\\\",\\\"index\\\":0},\\\"value\\\":{\\\"@qclass\\\":\\\"bigint\\\",\\\"digits\\\":\\\"0\\\"}}\",\"slots\":[\"o+14/1\"]},\"recoverySet\":{\"body\":\"{\\\"@qclass\\\":\\\"slot\\\",\\\"iface\\\":\\\"Alleged: setStore\\\",\\\"index\\\":0}\",\"slots\":[\"o+8/29\"]}}","length":3}
2022-08-12T20:37:21.8579093Z stage-2: chain: 2022-08-12T20:37:21.841Z SwingSet: kernel: got : {"0":"vatstoreGet","1":"vc.12.|schemata","length":2}
2022-08-12T20:37:21.8684929Z stage-2: chain: portHandler threw (Error#1)
2022-08-12T20:37:21.8689099Z stage-2: chain: Error#1: historical inaccuracy in replay of v7
2022-08-12T20:37:21.8692132Z stage-2: chain:
2022-08-12T20:37:21.8695606Z stage-2: chain: at requireIdentical (.../swingset-vat/src/kernel/vat-loader/transcript.js:8:8)
2022-08-12T20:37:21.8720355Z stage-2: chain: at Object.simulateSyscall (.../swingset-vat/src/kernel/vat-loader/transcript.js:63:22)
2022-08-12T20:37:21.8734717Z stage-2: chain: at Object.syscallFromWorker (.../swingset-vat/src/kernel/vat-loader/manager-helper.js:249:30)
2022-08-12T20:37:21.8739089Z stage-2: chain: at handleUpstream (.../swingset-vat/src/kernel/vat-loader/manager-subprocess-xsnap.js:86:11)
2022-08-12T20:37:21.8743272Z stage-2: chain: at handleCommand (.../swingset-vat/src/kernel/vat-loader/manager-subprocess-xsnap.js:109:14)
2022-08-12T20:37:21.8746933Z stage-2: chain: at handleCommand (packages/xsnap/src/replay.js:111:26)
2022-08-12T20:37:21.8750340Z stage-2: chain: at runToIdle (packages/xsnap/src/xsnap.js:203:42)
2022-08-12T20:37:21.8753859Z stage-2: chain: at runMicrotasks (<anonymous>)
It went away after a trivial commit (that shouldn't have changed anything), which is terrifying, but it also made me think that it'd be nice to capture more context in that situation.
I'm thinking that instead of only printing the one transcript entry that we were expecting, we should find a way to print the last 5 or 10 entries.
Also, sometimes the divergence is that the original did syscalls A/B/C, and the replay did B/A/C or A/C/B (swapping the order of two, like if somehow GC finalizers fired in a different order and we didn't sort properly to mask it). It'd be nice to be able to distinguish this from a replay that only did A/B or only A/C. To capture that, I think we'd have to allow the delivery to run to completion, and then only compare syscalls at the very end. Or, allow N more syscalls to be made before we stop accepting any more and just report the error. This sounds trickier, I'm not sure what it would take to implement it safely.
Description of the Design
Not sure yet
Security Considerations
We'd need to maintain the invariant that the kernel halts before committing any state changes. Beyond that, it ought to be safe to let the doomed vat finish out its last delivery.
Test Plan
This sort of logging is pretty tough to test, I'd be satisfied with a manual inspection of a deliberately provoked error. However, we would like confidence that the error can be emitted without e.g. a logging exception causing the final panic() to be bypassed, which would then fail the "kernel should halt" invariant.
So probably a unit test which triggers anachrophobia, and then asserts that the kernel has paniced. Then we can manually look at the log messages and decide that they're correct.
The text was updated successfully, but these errors were encountered:
My 6770 branch will only print a single transcript entry, but it will print all expected syscalls, and all executed syscalls, and highlight the point of divergence.
What is the Problem Being Solved?
@turadg encountered a seemingly-intermittent error today in the CI integration test:
It went away after a trivial commit (that shouldn't have changed anything), which is terrifying, but it also made me think that it'd be nice to capture more context in that situation.
I'm thinking that instead of only printing the one transcript entry that we were expecting, we should find a way to print the last 5 or 10 entries.
Also, sometimes the divergence is that the original did syscalls A/B/C, and the replay did B/A/C or A/C/B (swapping the order of two, like if somehow GC finalizers fired in a different order and we didn't sort properly to mask it). It'd be nice to be able to distinguish this from a replay that only did A/B or only A/C. To capture that, I think we'd have to allow the delivery to run to completion, and then only compare syscalls at the very end. Or, allow N more syscalls to be made before we stop accepting any more and just report the error. This sounds trickier, I'm not sure what it would take to implement it safely.
Description of the Design
Not sure yet
Security Considerations
We'd need to maintain the invariant that the kernel halts before committing any state changes. Beyond that, it ought to be safe to let the doomed vat finish out its last delivery.
Test Plan
This sort of logging is pretty tough to test, I'd be satisfied with a manual inspection of a deliberately provoked error. However, we would like confidence that the error can be emitted without e.g. a logging exception causing the final
panic()
to be bypassed, which would then fail the "kernel should halt" invariant.So probably a unit test which triggers anachrophobia, and then asserts that the kernel has paniced. Then we can manually look at the log messages and decide that they're correct.
The text was updated successfully, but these errors were encountered: