Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feat: Generate per-dialect options on init project #3733

Merged
merged 7 commits into from
Jan 31, 2025

Conversation

VaggelisD
Copy link
Contributor

Fixes #3459

Dynamically generate the engine-appropriate connection options through Pydantic's model_fields. The following conventions are applied:

  • If the Field is not required, it's commented out
  • If the Field has a (literal) default value it's appended to the RHS

@VaggelisD VaggelisD marked this pull request as draft January 28, 2025 13:37
@VaggelisD VaggelisD force-pushed the vaggelisd/project_init branch from d3cb328 to c5414fd Compare January 28, 2025 13:38
@sungchun12
Copy link
Contributor

sungchun12 commented Jan 28, 2025

I tested this out locally and realized it's not helpful unless you explain in plain terms what each config means paired with an example. I recommend you update the template with this look and feel. You can leverage our existing docs to fill out the config descriptions.

This mainly needs to be done for: snowflake, databricks, bigquery, postgres, duckdb, redshift

What will also help is that most users if they're serious will want to read our docs to verify what connection config settings matter to them. Let's save them the extra step and print it to the terminal: https://sqlmesh.readthedocs.io/en/stable/integrations/overview/#execution-engines

Nice to have, dynamically print a link in the terminal to the exact engine connection options based on the database. Example for sqlmesh init postgres -> https://sqlmesh.readthedocs.io/en/stable/integrations/engines/postgres/#connection-options

Actually, as I was writing this. The "nice to have" path may be the best of both worlds where you add in the commented out configs and then dynamically print the link to the relevant engine as it's much easier to onboard in that flow vs. cluttering the yaml with elongated comments. Let me know what you think about the nice to have path being the need to have one.

gateways:
  local:
    connection:
      type: bigquery
      # concurrent_tasks: 1
      # register_comments: True
      # pre_ping: 
      # pretty_sql: 
      # method: BigQueryConnectionMethod.OAUTH
      # project: 
      # execution_project: # The name of the GCP project to bill for the execution of the models. If not set, the project associated with the model will be used. type: string, ex: my-project
      # quota_project: 
      # location: 
      keyfile: # Path to the keyfile to be used with service-account method # type: string, ex: '/source/keyfile.json'
      # keyfile_json: 
      # token: 
      # refresh_token: 
      # client_id: 
      # client_secret: 
      # token_uri: 
      # scopes: 
      # job_creation_timeout_seconds: 
      # job_execution_timeout_seconds: 
      # job_retries: 1
      # job_retry_deadline_seconds: 
      # priority: 
      # maximum_bytes_billed: 

@VaggelisD
Copy link
Contributor Author

VaggelisD commented Jan 28, 2025

Actually, as I was writing this. The "nice to have" path may be the best of both worlds where you add in the commented out configs and then dynamically print the link to the relevant engine as it's much easier to onboard in that flow vs. cluttering the yaml with elongated comments. Let me know what you think about the nice to have path being the need to have one.

This was what I was about to recommend too, I think the dynamic generation is needed in order to be up to date with the latest options and outputting the documentation link as in "check here for more details" should do the trick.

So, will simply keep the fields empty and/or commented out for now to maintain the existing implementation if everyone's fine with this.

@VaggelisD VaggelisD force-pushed the vaggelisd/project_init branch from c7e1221 to 6b9bd8c Compare January 29, 2025 17:44
@VaggelisD VaggelisD force-pushed the vaggelisd/project_init branch 3 times, most recently from bf43bb7 to 62f8494 Compare January 30, 2025 16:59
@VaggelisD VaggelisD force-pushed the vaggelisd/project_init branch from 62f8494 to 1bac001 Compare January 30, 2025 17:11
@VaggelisD VaggelisD marked this pull request as ready for review January 30, 2025 17:31
@VaggelisD VaggelisD requested a review from a team January 30, 2025 17:31
Copy link
Contributor

@georgesittas georgesittas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. I think we may need to update a few existing doc pages if this is merged, to avoid showing the outdated config template.

@georgesittas georgesittas requested a review from a team January 30, 2025 18:04
@VaggelisD VaggelisD force-pushed the vaggelisd/project_init branch from 1c84f83 to bb6be58 Compare January 30, 2025 18:32
@VaggelisD VaggelisD merged commit ef7893d into main Jan 31, 2025
21 checks passed
@VaggelisD VaggelisD deleted the vaggelisd/project_init branch January 31, 2025 08:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

sqlmesh init <database> should dynamically generate a specific template config
4 participants