Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Silence validator vats #2658

Merged
merged 4 commits into from
Mar 16, 2021
Merged

Silence validator vats #2658

merged 4 commits into from
Mar 16, 2021

Conversation

michaelfig
Copy link
Member

@michaelfig michaelfig commented Mar 16, 2021

We don't want to scare away validators with error messages that are unrelated to the correct functioning of their node.

So, we disable vats' and liveslots' console.* when running ag-chain-cosmos start unless DEBUG=SwingSet:vat,SwingSet:ls. The current vat logging doesn't change for agoric start or for ag-solo.

@michaelfig michaelfig self-assigned this Mar 16, 2021
@michaelfig
Copy link
Member Author

@warner I've started on kernelEndowments.silentVats which is true when the kernel and workers should only call makeConsole(null) for all vats (thereby silencing them). I just haven't connected together the endowments of the silenced console to the vats in order to make it work.

@erights erights requested review from erights, warner, dckc and FUDCo March 16, 2021 06:17
@erights
Copy link
Member

erights commented Mar 16, 2021

Sorry. When I added myself as a reviewer I accidentally unlisted your other reviewers. Did I restore them correctly?

@erights
Copy link
Member

erights commented Mar 16, 2021

Oh, or were they reviewers? Maybe I was only seeing GitHub's suggestions? In any case, please undo any damage I did but keep me as a reviewer. Thanks.

Copy link
Contributor

@FUDCo FUDCo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks OK, at least w.r.t the SwingSet changes.

@warner
Copy link
Member

warner commented Mar 16, 2021

I'd suggest:

  • silenceVatConsole instead of silentVats
  • pass this through the kernelOptions= of buildKernel, rather than kernelEndowments, since it's just data. (the existing debugPrefix value should probably be moved to kernelOptions too, someday, so everything in kernelEndowments is an actual function that the kernel cannot obtain for itself)

I'm guessing we still need a way to deliver the silenceVatConsole or tag into the subprocess-based workers. I think this will be an additional argument to the setBundle message (which currently carries vatID, bundle, vatParameters, and virtualObjectCacheSize). manager-subprocess-xsnap.js sends this around line 151, the others (manager-subprocess-node.js and manager-nodeworker.js) should do something similar. Then the receiving/child-process side needs to extract the argument from the setBundle message and feed it into the makeConsole call.

@warner
Copy link
Member

warner commented Mar 16, 2021

BTW we should remember this overlaps with #2519 , a "consensus mode" switch which will (perhaps among other things) make console.log in the vats an absolute no-op (not even looking at the arguments). The silencing implemented in this PR might be sufficient for that, but since its goal is merely to not emit output, rather than to avoid non-deterministic reads of the arguments, I wouldn't commit to using silentVatConsole=true for consensusMode=true without an additional audit.

@warner
Copy link
Member

warner commented Mar 16, 2021

@michaelfig and I just talked and thought of a different (although overlapping) approach. The core problem was the "Temporary logging of marshalled error" that marshal.js does to console.log, which introduces a lot of noise (both in unit tests, where every time I see a stack trace I assume something has failed), and for validators. There are good reasons to log those errors (nobody else might, which means useful debugging information might be lost), but they also record a lot of not-really-a-problem errors, and errors that are caught and examined and maybe logged by thoughtful code elsewhere (but marshal can't tell).

One of the issues that came up was that locally-managed vats are each using an instance of liveslots that's created from the same makeLiveSlots function, which closes over a makeMarshal function, which uses console it gets from the kernelEndowments. So this "temporary logging" in marshal is using the kernel's console, not one that is specific to the each vat. If we wanted to silence logging on a per-vat basis, anything that happens in that vat's liveSlots or the encapsulated marshal is not using a per-vat console.

For now, since the goal is to have validators see less output, our plan is:

  • add a console argument to makeMarshal, use it for the "Temporary logging of error"
  • add a console argument to makeLiveSlots, pass it through to makeMarshal
  • change manager-local.js to give makeLiveSlots the same console it adds to the endowments of the Compartment it creates for the vat bundle
  • add a kernel option to change makeVatConsole to one that discards its output instead of logging it
  • enable this silenceVatConsole option for ag-chain-cosmos (maybe via some additional configuration mechanism that validators will get by default)
  • as extra credit, make this work for non-local vats (like xsnap):
    • add an option to the setBundle message sent to the child process
    • use that option to control the makeVatConsole used in the child
    • change the child's call to makeLiveSlots to include the console, just like with manager-local.js

@dckc
Copy link
Member

dckc commented Mar 16, 2021

...
For now, since the goal is to have validators see less output, our plan is:

  • add a console argument to makeMarshal, use it for the "Temporary logging of error"
  • add a console argument to makeLiveSlots, pass it through to makeMarshal
    ...

sounds like moving console from ambient authority to explicit authority. Always a good idea!

@michaelfig michaelfig force-pushed the mfig-silence-validator-vats branch from 35b4a09 to 87084b8 Compare March 16, 2021 19:53
@michaelfig michaelfig requested a review from FUDCo March 16, 2021 19:57
@michaelfig michaelfig force-pushed the mfig-silence-validator-vats branch from 87084b8 to 6db6ec3 Compare March 16, 2021 20:04
@michaelfig michaelfig marked this pull request as ready for review March 16, 2021 20:29
@warner
Copy link
Member

warner commented Mar 16, 2021

sounds like moving console from ambient authority to explicit authority. Always a good idea!

Well, kinda, except that 1: everybody expects a debug log function to be ambiently available, 2: if I'm debugging something and need to find out what's going on, I'm not going to be happy about modifying the entire call tree to pass in a debug log function. Certainly anything thats gets in my way when I'm stressed and under the gun to fix something is going to be a problem. As examined in #1318 , the difference between long-term "might be helpful to someone someday" log calls, and short-term exploratory "why the heck isn't my printf showing up, does that mean this code isn't even being called?" log calls, is pretty context-sensitive.

I think I'm ok with the intent to emit a debug message being ambient, but the caller/creator/next-frame-out gets to decide what actually happens when your code invokes it. If your intention was to write to a specific file, then yeah that's a good candidate for an argument-passed explicit authority. But when the intent is to just let someone bigger know what you're doing, then an ambient console.log or printf feels right to me, even if the call gets ignored in some circumstances.

@warner
Copy link
Member

warner commented Mar 16, 2021

@michaelfig I don't think this PR was supposed to change the xsnap/moddable submodule, right?

@michaelfig
Copy link
Member Author

@michaelfig I don't think this PR was supposed to change the xsnap/moddable submodule, right?

Whoops!

@michaelfig michaelfig force-pushed the mfig-silence-validator-vats branch from c97a856 to b7a9e84 Compare March 16, 2021 21:03
@michaelfig michaelfig requested a review from warner March 16, 2021 21:06
Copy link
Member

@warner warner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, seems good, thanks for making those changes.

I'm leery of the "restart with $DEBUG=" approach to debugging a problem (as opposed to something which records the data the first time around and lets you extract it after you notice a problem), and the general non-discoverability of the prefixes/facilities/loglevel strings that you need to put into $DEBUG, but it's a reasonable step to take for now.

Copy link
Member

@erights erights left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I need to dig into our anylogger sometime. I do not yet understand how it does or should relate to our SES console.

I am not happy to see us dig our process.env hole deeper. But you're only doing it in places where we're already digging, and we haven't figured out what we want to do instead, so

LGTM

packages/marshal/src/types.js Show resolved Hide resolved
The default for this is the same "Temporary logging of sent error"
console.log.  However, this allows the caller to decide to do
something different to save the errors.
This allows the SwingSet application to use the console tag
in order to redirect errors in liveslots.  `SwingSet:ls:vXX` and
`SwingSet:vat:vXX` are now in use.
This helps quell the anxiety that validators get when bizarre
application errors happen.  They still appear when running locally
or an ag-solo, but not on-chain.

To reenable, use:
`DEBUG=SwingSet:vat,SwingSet:ls ag-chain-cosmos start`
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants