nomad operator raft/snapshot state doesn't stream results #11685

Open
tgross opened this issue Dec 15, 2021 · 3 comments

@tgross
Member

tgross commented Dec 15, 2021

As pointed out in #11451 (comment), the nomad operator raft state and nomad operator snapshot state commands don't stream their results to the CLI either. But unlike the raft logs command, both of these commands have to restore the entire state store into memory (into a raft FSM), so there's not a lot of gain to stream the results except on the "serialization to JSON" step.

For #11684 we were able to sensibly serialize the logs to newline-delimited JSON because they're uniform objects (they were all logMessage). But for the state we have this StateAsMap method that turns it all into a giant map of arrays, one for each state object we're dumping. The objects don't have identifiers, but we could conceivably wrap each one in an object that identifies its type and then stream newline-delimited JSON for each of them. Something like:

{ "Type": "Node", "Object": { "ID": "6fb2cf73-589b-c0fa-9e10-9075f61f5c52", "Datacenter": "dc1", "etc": "etc" } }
{ "Type": "Job", "Object": { "ID": "example", "Datacenter": "dc1", "etc": "etc" } }

It makes examining the state a bit more fussy because you can't do something like nomad operator snapshot state | jq '.Node' to get all the nodes, and instead have to do something more like nomad operator snapshot state | jq 'select(.Type == "Node") | .Object', but I'm not sure what tools other operators have built up around this command.

@davemay99 @schmichael do y'all have any thoughts on this?

@schmichael
Member

The NDJSON + envelope version seems fine. If it's no extra work to keep both formats and put the NDJSON version behind a -stream flag that might be the best of both worlds. This is a low level developer productivity tool, so I think we can be a little fast and loose with the implementation.

Currently do the state commands use approximately less than or equal to the amount of memory the state took on the server they were running on? If so while it's a bit of a hassle to spin up a cloud vm just to split a JSON file into 1-file-per-map-key, it's not like these commands are useless for large clusters until a streaming version is added.

@tgross
Member Author

tgross commented Dec 17, 2021

> This is a low level developer productivity tool, so I think we can be a little fast and loose with the implementation.

Totally agreed.

> Currently do the state commands use approximately less than or equal to the amount of memory the state took on the server they were running on? If so while it's a bit of a hassle to spin up a cloud vm just to split a JSON file into 1-file-per-map-key, it's not like these commands are useless for large clusters until a streaming version is added.

I haven't measured but it should be quite a bit more. We have to load the FSM (ref operator_raft_state.go#L75-L93) just like a real server does, and then encode all that data into a single JSON object (which should be larger b/c it's a less compact representation). Even if we still loaded the whole FSM, just being able to stream the JSON would cut down the memory requirements.

@davemay99
Contributor

After dumping the state into one huge JSON file, I typically loop over the keys and extract each to a separate JSON file. It would save a ton of extra work if we streamed each table directly into a separate NDJSON file.
