Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Generate config.yaml when running sql-server #8644

Open
wants to merge 24 commits into
base: main
Choose a base branch
from

Conversation

milogreg
Copy link

@milogreg milogreg commented Dec 5, 2024

If sql-server is ran without a specified config file and there is no config.yaml in the database directory, one will be generated.

The rules for how the default config.yaml file is generated are as follows:

  • If a field has a value defined by the sql-server execution (such as through CLI args), then that value will be used.
  • If a field has no set value but has a default value, then that default will be used.
  • If a field has no set value and no default, a commented-out line setting the field to null will be included as a placeholder.

Part of #7980

@timsehn
Copy link
Contributor

timsehn commented Dec 5, 2024

Very cool!

My first comment is that it could use some bats tests. It probably should have its own test file.

Here's an example to crib from:

https://github.com/dolthub/dolt/blob/main/integration-tests/bats/sql-server.bats

Also, the bats readme can help you get started with running your tests:

https://github.com/dolthub/dolt/tree/main/integration-tests/bats#readme

@milogreg
Copy link
Author

milogreg commented Dec 9, 2024

Made some bats tests.

However, I noticed some other tests using sql-server skip when running windows due to issues I'm unaware of, so I'm not sure if the tests will behave properly on windows yet (I've only ran them on Mac).

@timsehn
Copy link
Contributor

timsehn commented Dec 9, 2024

@jycor will review this today.

@zachmu
Copy link
Member

zachmu commented Dec 9, 2024

Some of these bats tests are failing because we just removed a couple of config keys the tests are expecting. Merge main, run locally, and fix the tests.

@zachmu
Copy link
Member

zachmu commented Dec 9, 2024

Also @milogreg you should join the discord to talk in real time for faster turnaround on these issues.

Copy link
Contributor

@jycor jycor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this contribution! I like the high level idea of this PR.
The BATS are very thorough and well tested!

However, I don't like the use of reflect for fillDefault. I think it would be better to explicitly check each field of the struct and fill accordingly.

Lastly, I'm concerned about how much longer BATS is taking. I'm not sure if it's a side effect of github actions on a PR from a fork or if the config generation is slowing it down signigicantly.

Copy link
Member

@zachmu zachmu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Functionality looks good but see comments

go/cmd/dolt/commands/sqlserver/sqlserver.go Show resolved Hide resolved
go/cmd/dolt/commands/sqlserver/sqlserver.go Show resolved Hide resolved
go/cmd/dolt/commands/sqlserver/sqlserver.go Outdated Show resolved Hide resolved
go/libraries/doltcore/servercfg/yaml_config.go Outdated Show resolved Hide resolved
go/libraries/doltcore/servercfg/yaml_config.go Outdated Show resolved Hide resolved
go/libraries/doltcore/servercfg/yaml_config_test.go Outdated Show resolved Hide resolved
go/libraries/doltcore/servercfg/yaml_config.go Outdated Show resolved Hide resolved
@milogreg
Copy link
Author

Thanks for this contribution! I like the high level idea of this PR. The BATS are very thorough and well tested!

However, I don't like the use of reflect for fillDefault. I think it would be better to explicitly check each field of the struct and fill accordingly.

Lastly, I'm concerned about how much longer BATS is taking. I'm not sure if it's a side effect of github actions on a PR from a fork or if the config generation is slowing it down signigicantly.

I'll remove that usage of reflect, and look into potential performance issues.

I didn't put much thought into performance when designing this, as I falsely assumed that sql-server wasn't something that would be ran repeatedly from a clean state (I clearly didn't take testing into account). So, I wouldn't be surprised at all if the config generation is the cause of the slowdown.

The first thing I'll try is removing the expensive usages of reflection (aside from the ones in the YAML marshaling itself), and see how that affects the runtime (hopefully this is all it will take to make it performant). If that isn't enough, I'll try to actually nail down what the bottleneck is and redesign accordingly.

However, I don't think I know enough to determine what a reasonable target time per config generation would be, is there an approximate maximum time that would be acceptable? I'll be benchmarking it on my m3 MacBook pro.

@milogreg
Copy link
Author

From my extremely quick testing it looks like almost all of the overhead of the config generation comes from my usage of dEnv.FS.WriteFile, which is many times slower than os.WriteFile. I think I just did it like this because I saw other parts of the codebase write files in this way. Is it ok to switch to os.WriteFile, or is there another more proper way of writing files faster?

@milogreg
Copy link
Author

I looked into the dEnv.FS.WriteFile performance difference, it looks like it's significantly slower than os.WriteFile because it writes the file atomically. I think the implementation of dEnv.FS.WriteFile (WriteFileAtomically in go/libraries/utils/file/sync.go) is pretty optimal, so this overhead may be unavoidable unless we want to give up the atomic guarantee. I can't think of a major reason as to why it would be bad to leave an incomplete config.yaml in the event of a crash, so I'd assume it would be okay to use the non-atomic WriteFile, but I wont make that decision by myself. Another option, for if an atomic write isn't needed but using dEnv.FS is required, is to use dEnv.FS.OpenForWrite (which is about as performant as os.WriteFile).

@milogreg milogreg requested review from zachmu and jycor December 11, 2024 18:20
Copy link
Member

@zachmu zachmu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the tests, seeing a concrete example of the output made it clear how this could be improved.

This is getting close, appreciate the effort on this.

go/libraries/doltcore/servercfg/yaml_config.go Outdated Show resolved Hide resolved
@@ -415,3 +414,33 @@ listener:
err = ValidateConfig(cfg)
assert.Error(t, err)
}

func TestCommentNullYAMLValues(t *testing.T) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test is good but this is important enough behavior we want to exercise it a little more thoroughly, and to be more end-to-end. These are nice because it lets a human reviewer see what the behavior is explicitly.

When I see it spelled out like this, I notice the behavior here isn't quite what we want. A commented-out line with value1: null tells the user there's a key called value1 but doesn't tell them what a normal value should be if they decide they want to set it. The goal is to create a self-documenting process where a customer gets a file created that they can use to learn about the configuration options and change them themselves going forward.

Better end-to-end behavior would be:

  • Every key explicitly specified via command line flags is set
  • Every other key is commented out with its default value (including those in structs)
  • Config elements that currently have no default defined need example values to server as placeholders for this use case, rather than null. You will need to define these explicitly using the docs as a guide.
  • A comment at the top of the file indicating it was generated and that users should uncomment and edit lines as necessary.

With all that in place, what I want to see is one or more Go tests like this one, but for the entire process end-to-end, i.e. "given these command line options, the following config file is produced verbatim."

@zachmu
Copy link
Member

zachmu commented Dec 18, 2024

Looks like you'll need to merge main again.

@zachmu
Copy link
Member

zachmu commented Jan 2, 2025

@milogreg do you still want to work on this? We can probably run it home for you if not. Let us know either way.

@milogreg
Copy link
Author

milogreg commented Jan 4, 2025

Thanks for checking in, I plan to work on it over the next week.

@milogreg milogreg requested a review from zachmu January 8, 2025 00:04
Copy link
Member

@zachmu zachmu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks really good, just minor comments.

There's one field which is internal only / undocumented, so we don't want it showing up in the generated config file. Up to you how to best address that.

const yamlConfigName = "config.yaml"

apr := cli.ParseArgsOrDie(ap, args, help)
if err := validateSqlServerArgs(apr); err != nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is redundant, already happens in ServerConfigFromArgs

Comment on lines 547 to 548
generatedYaml := `# This file was generated using your configuration.
# Uncomment and edit lines as necessary to modify your configuration.` + "\n\n" + yamlConfig.VerboseString()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
generatedYaml := `# This file was generated using your configuration.
# Uncomment and edit lines as necessary to modify your configuration.` + "\n\n" + yamlConfig.VerboseString()
generatedYaml := `# Dolt SQL server configuration
#
# Uncomment and edit lines as necessary to modify your configuration.
# Full documentation: https://docs.dolthub.com/sql-reference/server/configuration
#` + "\n\n" + yamlConfig.VerboseString()

}

if validatingCfg, ok := cfg.(ValidatingServerConfig); ok {
res.GoldenMysqlConn = zeroIf(ptr(validatingCfg.GoldenMysqlConnectionString()), !cfg.ValueSet(GoldenMysqlConnectionStringKey))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This setting is internal only and undocumented, let's omit.

}
}
if withPlaceholders.MetricsConfig.Host == nil {
withPlaceholders.MetricsConfig.Host = ptr("123.45.67.89")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
withPlaceholders.MetricsConfig.Host = ptr("123.45.67.89")
withPlaceholders.MetricsConfig.Host = ptr("localhost")


if withPlaceholders.MetricsConfig.Labels == nil {
withPlaceholders.MetricsConfig.Labels = map[string]string{
"label1": "value1",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Default for this should be an empty map (docs are confusing, it's a very advanced use case)

}
}

if withPlaceholders.GoldenMysqlConn == nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Definitely omit this from the final output

}
}

if withPlaceholders.Jwks == nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Default for this field should be empty (it's advanced and poorly documented, currently only used for our hosting service)

MetricsConfig MetricsYAMLConfig `yaml:"metrics"`
RemotesapiConfig RemotesapiYAMLConfig `yaml:"remotesapi"`
MetricsConfig MetricsYAMLConfig `yaml:"metrics,omitempty"`
RemotesapiConfig RemotesapiYAMLConfig `yaml:"remotesapi,omitempty"`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We want ClusterConfig and MetricsConfig and Jwks the bottom of the file, since they are only for advanced users. Since the yaml marshaler outputs in struct field order, that means we should move these fields to the end of the struct.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants