Commit 8e22282: Update guide migration guide
bobbyiliev committed Sep 16, 2024

docs/guides/materialize_source_table.md
---
page_title: "Source versioning: migrating to `materialize_source_table` Resource"
subcategory: ""
description: |-
---

# Source versioning: migrating to `materialize_source_table_{source}` Resource

In previous versions of the Materialize Terraform provider, source tables were defined within the source resource itself and were considered subsources of the source rather than separate entities.
### Example: MySQL Source

```hcl
resource "materialize_source_mysql" "mysql_source" {
  name         = "mysql_source"
  cluster_name = "cluster_name"

  mysql_connection {
    name = materialize_connection_mysql.mysql_connection.name
  }

  table {
    upstream_name        = "mysql_table1"
    upstream_schema_name = "shop"
  }
}
```

The same approach was used for other source types such as Postgres and the load generator sources.
### Example: Kafka Source

```hcl
resource "materialize_source_kafka" "example_source_kafka_format_text" {
  name         = "source_kafka_text"
  comment      = "source kafka comment"
  cluster_name = materialize_cluster.cluster_source.name
  topic        = "topic1"

  kafka_connection {
    name          = materialize_connection_kafka.kafka_connection.name
    schema_name   = materialize_connection_kafka.kafka_connection.schema_name
    database_name = materialize_connection_kafka.kafka_connection.database_name
  }

  key_format {
    text = true
  }

  value_format {
    text = true
  }
}
```

## New Approach

The new approach separates source definitions and table definitions. You will now create the source without specifying the tables, and then define each table using the `materialize_source_table_{source}` resource.

## Manual Migration Process

This manual migration process requires users to create new source tables using the new `materialize_source_table_{source}` resource and then remove the old ones. We'll cover examples for both MySQL and Kafka sources.

### Step 1: Define `materialize_source_table_{source}` Resources

Before making any changes to your existing source resources, create new `materialize_source_table_{source}` resources for each table that is currently defined within your sources. This ensures that the tables are preserved during the migration.

#### MySQL Example:

```hcl
resource "materialize_source_table_mysql" "mysql_table_from_source" {
  name          = "mysql_table1_from_source"
  schema_name   = "public"
  database_name = "materialize"

  source {
    name = materialize_source_mysql.mysql_source.name
  }

  upstream_name        = "mysql_table1"
  upstream_schema_name = "shop"

  ignore_columns = ["about"]
}
```

#### Kafka Example:

```hcl
resource "materialize_source_table_kafka" "kafka_table_from_source" {
  name          = "kafka_table_from_source"
  schema_name   = "public"
  database_name = "materialize"

  source_name {
    name = materialize_source_kafka.kafka_source.name
  }

  key_format {
    text = true
  }

  value_format {
    text = true
  }
}
```

### Step 2: Apply the Changes

Run `terraform plan` and `terraform apply` to create the new `materialize_source_table_{source}` resources.

> **Note:** This will start an ingestion process for the newly created source tables.
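If you prefer to stage the change, Terraform's standard `-target` flag can scope the plan and apply to just the new table resources. The resource addresses below are illustrative:

```shell
# Preview only the new table resources (addresses are examples)
terraform plan \
  -target=materialize_source_table_mysql.mysql_table_from_source \
  -target=materialize_source_table_kafka.kafka_table_from_source

# Create them without touching the rest of the configuration
terraform apply \
  -target=materialize_source_table_mysql.mysql_table_from_source \
  -target=materialize_source_table_kafka.kafka_table_from_source
```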

### Step 3: Remove Table Blocks from Source Resources

Once the new `materialize_source_table_{source}` resources are successfully created, remove the deprecated table-specific attributes from your source resources.

#### MySQL Example:

For MySQL sources, remove the `table` block and any table-specific attributes from the source resource:

```hcl
resource "materialize_source_mysql" "mysql_source" {
  name         = "mysql_source"
  cluster_name = "cluster_name"

  mysql_connection {
    name = materialize_connection_mysql.mysql_connection.name
  }
}
```

This will drop the old tables from the source resources.


During the migration, you can use both the old `table` blocks and the new `materialize_source_table_{source}` resources simultaneously. This allows for a gradual transition until the old method is fully deprecated.

The same approach can be used for other source types such as Postgres, e.g. `materialize_source_table_postgres`.
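As an illustration, a Postgres table resource would mirror the MySQL one; the attribute names below are assumed to match the MySQL resource and should be checked against the provider documentation:

```hcl
resource "materialize_source_table_postgres" "postgres_table_from_source" {
  name          = "postgres_table1_from_source"
  schema_name   = "public"
  database_name = "materialize"

  source {
    name = materialize_source_postgres.postgres_source.name
  }

  upstream_name        = "postgres_table1"
  upstream_schema_name = "shop"
}
```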

#### Kafka Example:

For Kafka sources, remove the `format`, `include_key`, `include_headers`, and other table-specific attributes from the source resource:

```hcl
resource "materialize_source_kafka" "kafka_source" {
  name         = "kafka_source"
  cluster_name = "cluster_name"

  kafka_connection {
    name = materialize_connection_kafka.kafka_connection.name
  }

  topic = "example_topic"

  # Remove the format, include_key, include_headers, and other table-specific attributes

  lifecycle {
    ignore_changes = [
      include_key,
      include_headers,
      format,
      # ...
    ]
  }
}
```

In the `lifecycle` block, the `ignore_changes` meta-argument prevents Terraform from trying to update these attributes during subsequent applies. These attributes are no longer defined in the source resource itself but in the new `materialize_source_table_{source}` resources, so without it Terraform would try to update them based on incomplete information from the state.

### Step 4: Update Terraform State

After removing the `table` blocks and the table- and topic-specific attributes from your source resources, run `terraform plan` and `terraform apply` again to update the Terraform state and apply the changes.

### Step 5: Verify the Migration

After applying the changes, verify that your tables are still correctly set up in Materialize by checking the table definitions using Materialize's SQL commands.
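For example, you can confirm that the migrated tables exist by querying `mz_tables` from any SQL client connected to your Materialize region. The connection string and table names below are placeholders:

```shell
# List the migrated tables and their IDs (connection string is illustrative)
psql "postgres://USER@HOST:6875/materialize" -c \
  "SELECT id, name FROM mz_tables WHERE name IN ('mysql_table1_from_source', 'kafka_table_from_source');"
```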

## Importing Existing Tables

To import existing tables into your Terraform state, use the following command:

```bash
terraform import materialize_source_table_{source}.table_name <region>:<table_id>
```

Replace `{source}` with the appropriate source type (e.g., `mysql`, `kafka`), `<region>` with the actual region, and `<table_id>` with the table ID. You can find the table ID by querying the `mz_tables` table.
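As a sketch, finding the ID and running the import might look like this; the connection string, region, and table ID below are example values:

```shell
# Look up the table ID in Materialize (connection string is illustrative)
psql "postgres://USER@HOST:6875/materialize" -c \
  "SELECT id FROM mz_tables WHERE name = 'mysql_table_from_source';"

# Import using your region and the returned ID (example values shown)
terraform import materialize_source_table_mysql.mysql_table_from_source aws/us-east-1:u123
```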

### Important Note on Importing

Due to limitations in the current read function, not all properties of the source tables are available when importing. To work around this, you'll need to use the `ignore_changes` lifecycle meta-argument for certain attributes that can't be read back from the state.

For example, for a Kafka source table:

```hcl
resource "materialize_source_table_kafka" "kafka_table_from_source" {
  name          = "kafka_table_from_source"
  schema_name   = "public"
  database_name = "materialize"

  source_name = materialize_source_kafka.kafka_source.name

  include_key     = true
  include_headers = true

  envelope {
    upsert = true
  }

  lifecycle {
    ignore_changes = [
      include_key,
      include_headers,
      envelope,
      # ... add other attributes here as needed
    ]
  }
}
```

This `ignore_changes` block tells Terraform to ignore changes to these attributes during subsequent applies, preventing Terraform from trying to update these values based on incomplete information from the state.

After importing, you may need to manually update these ignored attributes in your Terraform configuration to match the actual state in Materialize.

## Future Improvements
