Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[feature] table resource #242

Merged
merged 21 commits into from
Oct 7, 2020

Conversation

johnpeterharvey
Copy link
Contributor

@johnpeterharvey johnpeterharvey commented Aug 20, 2020

Summary

Resolves #123
Addition of extra resource type snowflake_table, to enable terraform creation of tables in Snowflake.

The implementation is simple, allowing for the most basic creation. Can be extended with support for things like DROP CASCADE etc, but needed the basic table creation ability to be able to terraform out environments for our project.

Example usage:

resource "snowflake_table" "table" {
  database = "FULFILMENT"
  schema = "PUBLIC"
  name = "test"
  column {
    name = "column1"
    type = "VARCHAR"
  }
  column {
    name = "column2"
    type = "VARIANT"
  }
}

Output attribute data is currently only the table owner.

Test Plan

Have added unit tests and base acceptance test in line with the test cases for other similar resources.

References

Used https://docs.snowflake.com/en/sql-reference/sql/create-table.html for reference.

Thanks for this provider, we're finding it really useful! 🙂

@johnpeterharvey johnpeterharvey requested a review from a team as a code owner August 20, 2020 20:35
@johnpeterharvey johnpeterharvey requested review from alldoami and removed request for a team August 20, 2020 20:35
@johnpeterharvey johnpeterharvey changed the title Issue 123 table Table resource Aug 20, 2020
@robbruce
Copy link
Contributor

@johnpeterharvey I'd be very cautious about having ForceNew applied to columns, when an a change occurs then any existing data in the table would cause it to be dropped; if the data retention days is set to zero then it would cause problems.

I would say that there should be some sort of option for how to apply changes:

  • Drop and Re-Create (like you have now)
  • Append new columns to the end of the table
  • CTAS (Create Table as Select) approach; create the table under a temp name and then drop the existing followed by a rename of the table to the original name

There will likely need to be some validation on data type changes too, as not all data types can be converted into a different type.

Would be nice if the other elements are covered too:

  • Column default values, including identity options
  • Clustering Keys
  • Primary Keys
  • Unique Constraints
  • Foreign Keys
  • Data Retention Days
  • Transient Table boolean flag
  • Change Tracking boolean flag

I know that that unique and foreign keys are not used, but a lot of reporting solutions use the metadata on the table to discover relationships in the schema.

Happy to collaborate on this if you need assistance.

@ryanking
Copy link
Contributor

@johnpeterharvey thanks for this!

I haven't had a chance to read this PR in detail yet, but I notice that there are no acceptance tests.

Though other resources lack such tests, I don't think we can take PRs for new resources which don't have them. It makes maintenance too hard.

@johnpeterharvey
Copy link
Contributor Author

@johnpeterharvey thanks for this!

I haven't had a chance to read this PR in detail yet, but I notice that there are no acceptance tests.

Though other resources lack such tests, I don't think we can take PRs for new resources which don't have them. It makes maintenance too hard.

Yep, that makes sense. I've added a basic acceptance test, mimicking those existing for snowflake_database and snowflake_schema. It just covers the basic creation & deletion as that is basically what the resource provides at the moment, but as options are added it can be extended to test them too 👍

@johnpeterharvey
Copy link
Contributor Author

johnpeterharvey commented Aug 22, 2020

@johnpeterharvey I'd be very cautious about having ForceNew applied to columns, when an a change occurs then any existing data in the table would cause it to be dropped; if the data retention days is set to zero then it would cause problems.

I would say that there should be some sort of option for how to apply changes:

  • Drop and Re-Create (like you have now)
  • Append new columns to the end of the table
  • CTAS (Create Table as Select) approach; create the table under a temp name and then drop the existing followed by a rename of the table to the original name

There will likely need to be some validation on data type changes too, as not all data types can be converted into a different type.

Would be nice if the other elements are covered too:

  • Column default values, including identity options
  • Clustering Keys
  • Primary Keys
  • Unique Constraints
  • Foreign Keys
  • Data Retention Days
  • Transient Table boolean flag
  • Change Tracking boolean flag

I know that that unique and foreign keys are not used, but a lot of reporting solutions use the metadata on the table to discover relationships in the schema.

Happy to collaborate on this if you need assistance.

I agree that there is a risk of data loss if you change the column definitions and then agree on the apply plan to drop and recreate your table. However, the same risk applies if you rename the table, change the schema, and so on.

For ways to add safety, I'd be hesitant to append the new columns to the end of the table, as it'll lead to a divergence between the state as shown in terraform, and the actual state of the resource - the actual resource would be a superset of the past and present terraform.

The create under temporary name could be neat, maybe renaming rather than dropping the existing table to preserve any data contained within.

Also agree that supporting all the flags and parameters would be good. Other resources also just support a minimal parameter set, however if there are any that you'd find particularly useful it'd be great if you want to add support for them. I suspect that as we have a business intelligence team starting to look at Snowflake in the coming months, I'll be coming back with further PRs to add functionality as they request things, but unfortunately can't devote too much time to it at the moment.

I think my use case is similar to the basic use case described in the issue - we're not yet live, so I'm currently lacking guidance as to what our business intelligence team will need, but we want to start flowing data into Snowflake to have that integration done. That comes with wanting to be able to terraform up and tear down arbitrary environments as we go towards launch (with their own Snowflake tables and ingest pipes), but without the ability to create the basic table resources, we're missing the piece of the puzzle that would allow us to do this.

@robbruce
Copy link
Contributor

robbruce commented Aug 24, 2020

@johnpeterharvey

I can understand that right now you're trying to get to an MVP point, but you could maybe change your schema so that this resource is extensible? Using a schema.TypeMap could be quite restrictive.

Maybe change columns to be be something like; the options for defaults, identity and column comments have somewhere to go.

"column": {
	Type:     schema.TypeList,
	Required: true,
	MinItems: 1,
	ForceNew: true,
	Elem: &schema.Resource{
		Schema: map[string]*schema.Schema{
			"name": {
				Type:         schema.TypeString,
				Required:     true,
			},

			"type": {
				Type:         schema.TypeString,
				Required:     true,
			},
		},
	},
},

A schema.TypeList is the only collection type which is ordered, so using that will ensure parity between the declared order and what's in Snowflake.

@johnpeterharvey
Copy link
Contributor Author

@johnpeterharvey

I can understand that right now your trying to get to an MVP point, but you could maybe change your schema so that this resource is extensible?

Maybe change columns to be be

"column": {
	Type:     schema.TypeList,
	Required: true,
	MinItems: 1,
	ForceNew: true,
	Elem: &schema.Resource{
		Schema: map[string]*schema.Schema{
			"name": {
				Type:         schema.TypeString,
				Required:     true,
			},

			"type": {
				Type:         schema.TypeString,
				Required:     true,
			},
		},
	},
},

So that attributes can be added onto a column in the future, using a schema.TypeMap could be quite restrictive. A schema.TypeList is the only collection type which is ordered, so using that will ensure parity between the declared order and what's in Snowflake.

Ah, thank you! Yes, that sounds like a good strategy. Appreciate the suggestion - I'm new to Golang and still learning a lot! I'll try to make that change today and update the merge request.

@robbruce
Copy link
Contributor

@alldoami I have a number enhancements to add to this resource, but as it's not in master currently would be tricky to issue a pull request. Do you have an indication as to when this might be reviewed?

Just set some context, our company wrote a terraform provider for snowflake but was planning on making open source once it was ready. However, after reviewing your provider we opted not continue development and contribute to this instead.

We have identified functionality that we have coded already that we are planning on converting, but for a table resource we would need this merging first (into master or another branch). All of the features listed above we have written for previously.

@ryanking ryanking changed the title Table resource [feature] table resource Aug 25, 2020
pkg/resources/table.go Outdated Show resolved Hide resolved
pkg/snowflake/table.go Outdated Show resolved Hide resolved
@ryanking
Copy link
Contributor

When ready for review again, please click the "re-request review" button.

@johnpeterharvey
Copy link
Contributor Author

Sorry for the lack of action on this the last few weeks, had less time to revisit than I had planned. Still not really sure where to go with alterations. @ryanking , would you be able to give any guidance on my reply to your change request above please? Thanks!

@brucedvgw
Copy link

Keen to see this come through.

// UpdateTable implements schema.UpdateFunc
func UpdateTable(data *schema.ResourceData, meta interface{}) error {
// https://www.terraform.io/docs/extend/writing-custom-providers.html#error-handling-amp-partial-state
data.Partial(true)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this goes in after #262, we need to remove partial usage.

@ryanking ryanking merged commit 70bf442 into Snowflake-Labs:master Oct 7, 2020
@berosen berosen mentioned this pull request May 18, 2021
1 task
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add snowflake_table resource for creating and managing table objects
4 participants