Commit 8e22282: Update guide migration guide
bobbyiliev committed Sep 16, 2024

docs/guides/materialize_source_table.md
---
page_title: "Source versioning: migrating to `materialize_source_table` Resource"
subcategory: ""
description: |-
---

# Source versioning: migrating to `materialize_source_table_{source}` Resource

In previous versions of the Materialize Terraform provider, source tables were defined within the source resource itself and were considered subsources of the source rather than separate entities.
### Example: MySQL Source

```hcl
resource "materialize_source_mysql" "mysql_source" {
  name         = "mysql_source"
  cluster_name = "cluster_name"

  mysql_connection {
    name = materialize_connection_mysql.mysql_connection.name
  }

  table {
    upstream_name        = "mysql_table1"
    upstream_schema_name = "shop"
  }
}
```

The same approach was used for other source types such as Postgres and the load generator sources.
### Example: Kafka Source

```hcl
resource "materialize_source_kafka" "example_source_kafka_format_text" {
  name         = "source_kafka_text"
  comment      = "source kafka comment"
  cluster_name = materialize_cluster.cluster_source.name
  topic        = "topic1"

  kafka_connection {
    name          = materialize_connection_kafka.kafka_connection.name
    schema_name   = materialize_connection_kafka.kafka_connection.schema_name
    database_name = materialize_connection_kafka.kafka_connection.database_name
  }

  key_format {
    text = true
  }

  value_format {
    text = true
  }
}
```

## New Approach

The new approach separates source definitions and table definitions. You will now create the source without specifying the tables, and then define each table using the `materialize_source_table_{source}` resource.

## Manual Migration Process

This manual migration process requires users to create new source tables using the new `materialize_source_table_{source}` resource and then remove the old ones. We'll cover examples for both MySQL and Kafka sources.

### Step 1: Define `materialize_source_table_{source}` Resources

Before making any changes to your existing source resources, create new `materialize_source_table_{source}` resources for each table that is currently defined within your sources. This ensures that the tables are preserved during the migration.

#### MySQL Example:

```hcl
resource "materialize_source_table_mysql" "mysql_table_from_source" {
  name          = "mysql_table1_from_source"
  schema_name   = "public"
  database_name = "materialize"

  source {
    name = materialize_source_mysql.mysql_source.name
  }

  upstream_name        = "mysql_table1"
  upstream_schema_name = "shop"

  ignore_columns = ["about"]
}
```

#### Kafka Example:

```hcl
resource "materialize_source_table_kafka" "kafka_table_from_source" {
  name          = "kafka_table_from_source"
  schema_name   = "public"
  database_name = "materialize"

  source_name {
    name = materialize_source_kafka.kafka_source.name
  }

  key_format {
    text = true
  }

  value_format {
    text = true
  }
}
```

### Step 2: Apply the Changes

Run `terraform plan` and `terraform apply` to create the new `materialize_source_table_{source}` resources.

> **Note:** This will start an ingestion process for the newly created source tables.
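If you prefer to stage the change, Terraform's standard `-target` flag can scope the plan and apply to just the new table resources. The resource addresses below are illustrative:

```shell
# Preview only the new table resources (addresses are examples)
terraform plan \
  -target=materialize_source_table_mysql.mysql_table_from_source \
  -target=materialize_source_table_kafka.kafka_table_from_source

# Create them without touching the rest of the configuration
terraform apply \
  -target=materialize_source_table_mysql.mysql_table_from_source \
  -target=materialize_source_table_kafka.kafka_table_from_source
```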

### Step 3: Remove Table Blocks from Source Resources

Once the new `materialize_source_table_{source}` resources are successfully created, remove the deprecated table-specific attributes from your source resources.

#### MySQL Example:

For MySQL sources, remove the `table` block and any table-specific attributes from the source resource:

```hcl
resource "materialize_source_mysql" "mysql_source" {
  name         = "mysql_source"
  cluster_name = "cluster_name"

  mysql_connection {
    name = materialize_connection_mysql.mysql_connection.name
  }
}
```

This will drop the old tables from the source resources.


During the migration, you can use both the old `table` blocks and the new `materialize_source_table_{source}` resources simultaneously. This allows for a gradual transition until the old method is fully deprecated.

The same approach can be used for other source types such as Postgres, e.g. `materialize_source_table_postgres`.
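As an illustration, a Postgres table resource would mirror the MySQL one; the attribute names below are assumed to match the MySQL resource and should be checked against the provider documentation:

```hcl
resource "materialize_source_table_postgres" "postgres_table_from_source" {
  name          = "postgres_table1_from_source"
  schema_name   = "public"
  database_name = "materialize"

  source {
    name = materialize_source_postgres.postgres_source.name
  }

  upstream_name        = "postgres_table1"
  upstream_schema_name = "shop"
}
```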

#### Kafka Example:

For Kafka sources, remove the `format`, `include_key`, `include_headers`, and other table-specific attributes from the source resource:

```hcl
resource "materialize_source_kafka" "kafka_source" {
  name         = "kafka_source"
  cluster_name = "cluster_name"

  kafka_connection {
    name = materialize_connection_kafka.kafka_connection.name
  }

  topic = "example_topic"

  # Remove the format, include_key, include_headers, and other table-specific attributes

  lifecycle {
    ignore_changes = [
      include_key,
      include_headers,
      format,
      # ...
    ]
  }
}
```

In the `lifecycle` block, the `ignore_changes` meta-argument prevents Terraform from trying to update these attributes during subsequent applies. These attributes are no longer defined in the source resource itself but in the new `materialize_source_table_{source}` resources, so without it Terraform would try to update them based on incomplete information from the state.

### Step 4: Update Terraform State

After removing the `table` blocks and the table- and topic-specific attributes from your source resources, run `terraform plan` and `terraform apply` again to update the Terraform state and apply the changes.

### Step 5: Verify the Migration

After applying the changes, verify that your tables are still correctly set up in Materialize by checking the table definitions using Materialize's SQL commands.
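For example, you can confirm that the migrated tables exist by querying `mz_tables` from any SQL client connected to your Materialize region. The connection string and table names below are placeholders:

```shell
# List the migrated tables and their IDs (connection string is illustrative)
psql "postgres://USER@HOST:6875/materialize" -c \
  "SELECT id, name FROM mz_tables WHERE name IN ('mysql_table1_from_source', 'kafka_table_from_source');"
```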

## Importing Existing Tables

To import existing tables into your Terraform state, use the following command:

```bash
terraform import materialize_source_table_{source}.table_name <region>:<table_id>
```

Replace `{source}` with the appropriate source type (e.g., `mysql`, `kafka`), `<region>` with the actual region, and `<table_id>` with the table ID. You can find the table ID by querying the `mz_tables` table.
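As a sketch, finding the ID and running the import might look like this; the connection string, region, and table ID below are example values:

```shell
# Look up the table ID in Materialize (connection string is illustrative)
psql "postgres://USER@HOST:6875/materialize" -c \
  "SELECT id FROM mz_tables WHERE name = 'mysql_table_from_source';"

# Import using your region and the returned ID (example values shown)
terraform import materialize_source_table_mysql.mysql_table_from_source aws/us-east-1:u123
```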

### Important Note on Importing

Due to limitations in the current read function, not all properties of the source tables are available when importing. To work around this, you'll need to use the `ignore_changes` lifecycle meta-argument for certain attributes that can't be read back from the state.

For example, for a Kafka source table:

```hcl
resource "materialize_source_table_kafka" "kafka_table_from_source" {
  name          = "kafka_table_from_source"
  schema_name   = "public"
  database_name = "materialize"

  source_name = materialize_source_kafka.kafka_source.name

  include_key     = true
  include_headers = true

  envelope {
    upsert = true
  }

  lifecycle {
    ignore_changes = [
      include_key,
      include_headers,
      envelope,
      # ... add other attributes here as needed
    ]
  }
}
```

This `ignore_changes` block tells Terraform to ignore changes to these attributes during subsequent applies, preventing Terraform from trying to update these values based on incomplete information from the state.

After importing, you may need to manually update these ignored attributes in your Terraform configuration to match the actual state in Materialize.

## Future Improvements
