Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for an embedded NATS server #158

Merged
merged 18 commits into from
Apr 25, 2023
Merged

Conversation

bruth
Copy link
Contributor

@bruth bruth commented Jan 31, 2023

No description provided.

@bruth
Copy link
Contributor Author

bruth commented Feb 2, 2023

@brandond Just an FYI that I will be a pushing a few more commits, refactoring the code, simplifying it, and adding tests where appropriate.

One question I did have was whether the CI that runs is the canonical/sufficient test suite that changes need to be tested against? Also one heads up is that I saw there was indeed a failure (which I am fixing) in the CI, but it still shows the CI passed: https://drone-pr.k3s.io/k3s-io/kine/147/1/3

@bruth
Copy link
Contributor Author

bruth commented Feb 2, 2023

Sorry, one more question. There are various log statements being emitted within the NATS driver. Is there any policy/best practice recommended for logging given it will be embedded within k3s?

@brandond
Copy link
Member

brandond commented Mar 30, 2023

Just an FYI that I will be a pushing a few more commits, refactoring the code, simplifying it, and adding tests where appropriate.

Are you still working on this?

whether the CI that runs is the canonical/sufficient test suite that changes need to be tested against?

The CI is sufficient to confirm that it works with K3s, but our QA will still want to do a full set of tests on it when we update the Kine that is embedded in K3s itself.

There are various log statements being emitted within the NATS driver. Is there any policy/best practice recommended for logging given it will be embedded within k3s?

Don't be overly chatty to the point where it obscures the K3s logs. Also keep in mind that K3s will never set the log level to TRACE, so if you ever want to see TRACE-level logs you'll need to run it standalone.

It increases the size by ~26% (7,768,648 bytes).

For the build flag, I would like to see the NATS server stuff stubbed out so that it just returns an error if someone attempts to use the embedded server when it's not built in. All of the related code should use the build flag to ensure that we're not pulling in any extra modules when it's not enabled. I'm not sure we'll want to enable the embedded server in K3s, as 7MB is a non-trivial addition to the binary size.

@bruth
Copy link
Contributor Author

bruth commented Mar 30, 2023

Thanks for the response @brandond.

Are you still working on this?

Absolutely. I paused after the hackathon due to other priorities as well as wanting answers to these questions so I didn't go down a wrong path for the contribution.

The CI is sufficient to confirm that it works with K3s, ...

Ok. I will assume if the CI passes, that will be sufficient to then move onto the QA review and we can address more issues there.

Don't be overly chatty...

Understood, thanks.

For the build flag, I would like to see the NATS server stuff stubbed out so that it just returns an error...

Will address, thanks.

@bruth
Copy link
Contributor Author

bruth commented Apr 13, 2023

Just checking that I have picked up on this again and will have updates by tomorrow (Friday).

Copy link
Contributor Author

@bruth bruth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I paused my refactoring efforts and will put those into a subsequent PR. I will likely manifest as a rewrite of the backend to be more optimized.

The scope of this PR is now some config cleanup and additions, minor name changes for clarity (should be non-breaking, the existing kine URLs format/params are the same), and supporting an embedded server including a further opt-in for a pure socket between the client and server (no TCP). This would only be suitable for a single node deployment.

scripts/build Show resolved Hide resolved
pkg/drivers/nats/server/interface.go Outdated Show resolved Hide resolved
@bruth bruth marked this pull request as ready for review April 14, 2023 19:25
@bruth bruth requested a review from brandond April 14, 2023 21:08
pkg/endpoint/endpoint.go Outdated Show resolved Hide resolved
pkg/endpoint/endpoint.go Show resolved Hide resolved
scripts/build Show resolved Hide resolved
start-test() {
local ip=$(cat $TEST_DIR/databases/*/metadata/ip)
local port=$(cat $TEST_DIR/databases/*/metadata/port)
KINE_IMAGE=$IMAGE KINE_ENDPOINT="nats://$ip:$port?embedServer" provision-kine
Copy link
Member

@brandond brandond Apr 17, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So just to be clear, the way to use this is to start kine with a nats:// scheme datastore endpoint with the embedServer URI parameter set?

What do I get if I just do nats://, as you can do with sqlite to get a default local database?

Can you also do something like nats://127.0.0.1?embedServer? If so, what port does it use?

If I set the dontListen option when using the embedded server, are the address and port still required, or can I just do nats://?embedServer&dontListen?

Can you add some examples that cover these different possibilities to a new markdown file in the examples directory?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Those are correct. There are default values, such as port 4222 and the dontListen example is also valid. I added some comments to the backend code, but will add examples.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just adding myself as i want to try this out when the PR is merged..

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added an examples doc with more information.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @bruth wil def help

examples/nats.md Outdated

- `nats://?embedServer`
- `nats://?embedServer&serverConfig=/path/to/server.conf`
- `nats://?embedServer&dontListen`
Copy link
Member

@brandond brandond Apr 19, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for adding docs!

Could you expand a bit on this bit:

  • Why would someone want to use the dontListen option?
  • What's the default listen address and port?
  • If it does listen, can other nodes connect to this node as the external NATS URL?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can add to the doc, but will explain here first for context. We have had NATS users who need to embed the server and their environment is constrained to only exposing a controlled or limited set of ports. So they take advantage of using this non-TCP approach to establishing a connection between a local client and server in the same process.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure that makes sense. I guess my question is, what is listening useful for? If you can't connect from another node when using the embedded server, why listen at all?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, this would only be useful in a single node setup embedded k3s -> kine -> nats.. which may be very niche, but an option.

The more common setup is embedded that does listen and can cluster and/or be configured for specialized setups (the two-node use case I helped work on during the SUSE Hackathon for instance). The specialized setup will require additional opt-in custom code for NATS, but will enable for active-passive or active-active setups.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @bruth

what is the embedded nats can be optionally put in leaf node and paired with a global nats cluster ?

Then maybe devs can have their cake and eat it too:

either run in embedded standalone mode

OR

run in embedded leaf mode paired up to a global nats cluster like Synadia etc .

is this a sensible option I wonder ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes you could configure it that way by passing a config file that declares the leafnodes block pointing to leaf node host. Other than the single KV bucket used in the backend, none of the other NATS capabilities are used. However, if you had a reason to mirror that KV back up to the hub cluster or even another leaf, that could definitely be done.

I have not tried it, but in theory you could have a read-only clone of another k3s+kine instance that uses a mirror of another instance. The second instance would receive all the KV changes over time (since its a stream under the hood) so the other k3s instance would mirror the same state (albeit lagging a tad since the replication is asynchronous).

Copy link

@gedw99 gedw99 Apr 20, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes you could configure it that way by passing a config file that declares the leafnodes block pointing to leaf node host. Other than the single KV bucket used in the backend, none of the other NATS capabilities are used. However, if you had a reason to mirror that KV back up to the hub cluster or even another leaf, that could definitely be done.

I have not tried it, but in theory you could have a read-only clone of another k3s+kine instance that uses a mirror of another instance. The second instance would receive all the KV changes over time (since its a stream under the hood) so the other k3s instance would mirror the same state (albeit lagging a tad since the replication is asynchronous).

So then no need for NATS cluster because Kine in nats mirror mode is a cluster.. have i got it right ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Assuming I understand correctly, if you wanted three kine instances with embedded nats, they could be clustered together via nats. If you wanted a kine instance/cluster hosting a mirror from another instance/cluster that is fine to.

The cluster is only necessary for the HA/FT, but you could have a single instance origin and single instance mirror if desired.

scripts/test Outdated Show resolved Hide resolved
scripts/test-run-nats-embedded Outdated Show resolved Hide resolved
@bruth
Copy link
Contributor Author

bruth commented Apr 19, 2023

I will rebase for the Go deps.

@bruth bruth force-pushed the embedded-nats branch 3 times, most recently from 2914803 to b4b47db Compare April 20, 2023 01:25
bruth added 11 commits April 19, 2023 21:26
Rename various identifiers to NATS rather than JetStream, which is a subsystem
of NATS.

Signed-off-by: Byron Ruth <[email protected]>
Signed-off-by: Byron Ruth <[email protected]>
Signed-off-by: Byron Ruth <[email protected]>
This reverts commit e22b756.

Signed-off-by: Byron Ruth <[email protected]>
Signed-off-by: Byron Ruth <[email protected]>
Signed-off-by: Byron Ruth <[email protected]>
bruth and others added 4 commits April 19, 2023 21:26
Signed-off-by: Byron Ruth <[email protected]>
Signed-off-by: Byron Ruth <[email protected]>
Co-authored-by: Brad Davidson <[email protected]>
Signed-off-by: Byron Ruth <[email protected]>
Co-authored-by: Brad Davidson <[email protected]>
Signed-off-by: Byron Ruth <[email protected]>
examples/nats.md Outdated Show resolved Hide resolved
@brandond
Copy link
Member

brandond commented Apr 20, 2023

I did a quick test of K3s with the nats server built in an it's actually not too terrible; it adds just over 1MiB to the shipping binary size.

// without nats
k3s binary dist/artifacts/k3s size 68452352 is less than max acceptable size of 73400320 bytes (70 MiB)
// with nats
k3s binary dist/artifacts/k3s size 69722112 is less than max acceptable size of 73400320 bytes (70 MiB)

I did notice that even when running with dontListen, the embedded server still seems to be listening? thats what the log says at least. I am also confused by the fact that it reports that it's connecting to a server at 0.0.0.0 which isn't possible.

brandond@dev01:~/go/src/github.com/k3s-io/k3s$ k3s server --token=token '--datastore-endpoint=nats://?embedServer&dontListen'
INFO[0000] Starting k3s v1.27.1+k3s-3988142f (3988142f)
INFO[0000] using an embedded NATS server
INFO[0000] started embedded NATS server
INFO[0000] using bucket: kine
INFO[0000] connecting to nats://0.0.0.0:4222
INFO[0000] Kine available at unix://kine.sock

@bruth
Copy link
Contributor Author

bruth commented Apr 20, 2023

^Hrm.. will check out why it is still listening or reporting as such.

Fix log line for embedded.

Signed-off-by: Byron Ruth <[email protected]>
@bruth
Copy link
Contributor Author

bruth commented Apr 20, 2023

Confirmed it was a misplaced log line, but does not actually listen.

@bruth
Copy link
Contributor Author

bruth commented Apr 20, 2023

@brandond If you are considering NATS can be built-in by default without requiring a build tag, then that may change the configuration design. Let me know either way and will update the code/docs to acknowledge that.

@brandond
Copy link
Member

brandond commented Apr 21, 2023

can be built-in by default without requiring a build tag

I'm not sure what you mean by this, but I am assuming you're talking about building K3s with or without the build tag to enable the embedded server?

I haven't run it by the rest of the team and our PM on the K3s side just yet. I think for Kine itself we can document it as available since we'll ship it that way in the release artifacts. Since the binary size impact for k3s is not terrible I think it should be doable over there as well.

@bruth
Copy link
Contributor Author

bruth commented Apr 24, 2023

I updated the example doc and decided to use the noEmbed query param to disable use of an embedded server when the server in fact embedded. This will make the embedded build default to use an embedded server.. and a non-embedded build expect to connect to an external server.

@brandond
Copy link
Member

brandond commented Apr 24, 2023

Ah, got it - so without the embedded server, the host:port are taken to be the address of an external server, and with it, they are the address to bind to?

@bruth
Copy link
Contributor Author

bruth commented Apr 24, 2023

Yes, correct.

Copy link
Member

@brandond brandond left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Do you want to squash and re-push, or should I just do it when I merge?

@bruth
Copy link
Contributor Author

bruth commented Apr 24, 2023

Feel free to squash merge and keep the issue title. Thank you!

Copy link
Member

@dereknola dereknola left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@brandond brandond merged commit ce77dcc into k3s-io:master Apr 25, 2023
@bruth bruth deleted the embedded-nats branch April 26, 2023 00:46
@bruth
Copy link
Contributor Author

bruth commented Apr 26, 2023

I appreciate the review and support!

gshilei pushed a commit to gshilei/kine that referenced this pull request Oct 31, 2023
* Add support for an embedded NATS server
* Put embedded server behind build flag
   Rename various identifiers to NATS rather than JetStream, which is a subsystem of NATS.
* Refactoring
* Update NATS versions and add embedded test
* Add nats build tag for tests
* Support embedded socket connection
* Update to NATS v2.9.16
* Remove unused variables
* Add backend "jetstream://" URI support
* Remove override of port
* Add examples doc
* Remove IP and port
* Fix whitespace
* Default to embedded it available
   Fix log line for embedded.
* Change to use noEmbed to disable embedding
* Fix test script and update to 2.9.16

Signed-off-by: Byron Ruth <[email protected]>
Co-authored-by: Brad Davidson <[email protected]>
Signed-off-by: zhiyuan <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants