Performance issue when run Terraform Plan with many Block Sets or NestedAttribute Sets #775

Open
jeremmfr opened this issue Jun 16, 2023 · 10 comments
Labels
bug Something isn't working

Comments

@jeremmfr

jeremmfr commented Jun 16, 2023

Module version

v1.2.0 & v1.3.1

Description

After migrating the resources of my provider jeremmfr/junos from the SDK plugin to this framework plugin, the time to run a terraform plan with a config that has many block sets has increased very significantly.

For 1000 items in a block set, a terraform plan (with an empty state) takes ~1.3 seconds with the provider built on the sdk/v2 module and ~70 seconds with the provider built on plugin-framework.
A user of my provider reported 19 minutes to run terraform plan with the new version of the provider (plugin-framework) instead of 52 seconds with the previous version of the provider (sdk/v2 module).

Tests

I tried removing all PlanModifiers and Validators fields on each attribute, and the ValidateConfig function for resources, using the latest version of this module (v1.3.1), but there was no change.
The problem disappears if I replace schema.SetNestedBlock with schema.ListNestedBlock: 70 seconds down to 1 second.

I reproduced the problem with the hashicups provider and the hashicups_order resource.
I tested 1000 items blocks on a hashicups_order resource with different variations of the items block.

hashicorp/terraform-provider-hashicups schema.TypeSet : 3.8 - 4 seconds
hashicorp/terraform-provider-hashicups schema.TypeList : 0.5 - 0.8 seconds
hashicorp/terraform-provider-hashicups-pf schema.SetNestedBlock : 49.1 - 49.7 seconds
hashicorp/terraform-provider-hashicups-pf schema.ListNestedBlock : 0.8 - 1.1 seconds
hashicorp/terraform-provider-hashicups-pf schema.SetNestedAttribute : 49.0 - 49.4 seconds
hashicorp/terraform-provider-hashicups-pf schema.ListNestedAttribute : 0.7 - 0.9 seconds

Relevant provider source code

To test with block sets, clone hashicorp/terraform-provider-hashicups-pf and apply this patch:

diff --git a/internal/provider/order_resource.go b/internal/provider/order_resource.go
index 4fe16cc..67203cd 100644
--- a/internal/provider/order_resource.go
+++ b/internal/provider/order_resource.go
@@ -76,10 +76,12 @@ func (r *orderResource) Schema(_ context.Context, _ resource.SchemaRequest, resp
 				Description: "Timestamp of the last Terraform update of the order.",
 				Computed:    true,
 			},
-			"items": schema.ListNestedAttribute{
+		},
+		Blocks: map[string]schema.Block{
+			"items": schema.SetNestedBlock{
 				Description: "List of items in the order.",
-				Required:    true,
-				NestedObject: schema.NestedAttributeObject{
+				//	Required:    true,
+				NestedObject: schema.NestedBlockObject{
 					Attributes: map[string]schema.Attribute{
 						"quantity": schema.Int64Attribute{
 							Description: "Count of this item in the order.",
@@ -87,7 +89,7 @@ func (r *orderResource) Schema(_ context.Context, _ resource.SchemaRequest, resp
 						},
 						"coffee": schema.SingleNestedAttribute{
 							Description: "Coffee item in the order.",
-							Required:    true,
+							Optional:    true,
 							Attributes: map[string]schema.Attribute{
 								"id": schema.Int64Attribute{
 									Description: "Numeric identifier of the coffee.",

and disable the hashicups.NewClient code in the provider's Configure function.

To test with nested attribute sets, clone hashicorp/terraform-provider-hashicups-pf and apply this patch:

diff --git a/internal/provider/order_resource.go b/internal/provider/order_resource.go
index 4fe16cc..ea229a7 100644
--- a/internal/provider/order_resource.go
+++ b/internal/provider/order_resource.go
@@ -76,7 +76,7 @@ func (r *orderResource) Schema(_ context.Context, _ resource.SchemaRequest, resp
 				Description: "Timestamp of the last Terraform update of the order.",
 				Computed:    true,
 			},
-			"items": schema.ListNestedAttribute{
+			"items": schema.SetNestedAttribute{
 				Description: "List of items in the order.",
 				Required:    true,
 				NestedObject: schema.NestedAttributeObject{
@@ -87,7 +87,7 @@ func (r *orderResource) Schema(_ context.Context, _ resource.SchemaRequest, resp
 						},
 						"coffee": schema.SingleNestedAttribute{
 							Description: "Coffee item in the order.",
-							Required:    true,
+							Optional:    true,
 							Attributes: map[string]schema.Attribute{
 								"id": schema.Int64Attribute{
 									Description: "Numeric identifier of the coffee.",

and disable the hashicups.NewClient code in the provider's Configure function.

Terraform Configuration Files

To test with block sets:

resource "hashicups_order" "example" {
  dynamic "items" {
    for_each = range(1, 1000)

    content {
      quantity = items.value
    }
  }
}

To test with nested attribute sets:

resource "hashicups_order" "example2" {
  items = [
    for e in range(1, 1000) : {
      quantity = e
    }
  ]
}

Expected Behavior

The terraform plan command takes a reasonable time to run when there are many block sets.

Actual Behavior

The terraform plan command takes a very long time to run when there are many block sets.

Steps to Reproduce

  1. build hashicups provider
  2. terraform init
  3. terraform plan

References

jeremmfr/terraform-provider-junos#498

@jeremmfr jeremmfr added the bug Something isn't working label Jun 16, 2023
@bflad
Contributor

bflad commented Jun 16, 2023

Hi @jeremmfr 👋 Thank you for reporting this issue and sorry for the trouble here. I just wanted to acknowledge that the maintainers have noticed this bug report and do plan on taking a further look into this, but that effort will probably happen early next week. In the meantime, if you have any familiarity with pprof and flamegraph profiling, that might be a good methodology for starting to narrow down where the unexpected extra time is being spent.
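
As an illustration of that methodology, one way to capture a CPU profile is to wrap the provider's main function with runtime/pprof. This is a minimal sketch, not the hashicups provider's actual main.go: the output file name, the registry address, and the New constructor (assumed to have the signature func() provider.Provider) are illustrative assumptions to adapt.

```go
package main

import (
	"context"
	"log"
	"os"
	"runtime/pprof"

	"github.com/hashicorp/terraform-plugin-framework/providerserver"
)

func main() {
	// Write a CPU profile covering the lifetime of the provider process.
	f, err := os.Create("cpu.pprof")
	if err != nil {
		log.Fatal(err)
	}
	defer f.Close()

	if err := pprof.StartCPUProfile(f); err != nil {
		log.Fatal(err)
	}
	// The profile is flushed when the plugin process shuts down after
	// the plan completes.
	defer pprof.StopCPUProfile()

	// New is the provider constructor (hypothetical here; adapt to the
	// provider being profiled).
	err = providerserver.Serve(context.Background(), New, providerserver.ServeOpts{
		Address: "registry.terraform.io/hashicorp/hashicups",
	})
	if err != nil {
		log.Fatal(err)
	}
}
```

The resulting cpu.pprof file can then be opened with `go tool pprof -http=:8080 cpu.pprof`, which includes a flame graph view.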

@jeremmfr
Author

jeremmfr commented Jun 17, 2023

Hi,

I've run profiling with pprof, using the hashicups provider, the items attribute as schema.SetNestedBlock, and a hashicups_order resource with 1000 dynamic items elements.

I've also updated all the dependencies of the hashicups provider:

 require (
 	github.com/hashicorp-demoapp/hashicups-client-go v0.1.0
-	github.com/hashicorp/terraform-plugin-docs v0.14.1
-	github.com/hashicorp/terraform-plugin-framework v1.2.0
-	github.com/hashicorp/terraform-plugin-go v0.15.0
-	github.com/hashicorp/terraform-plugin-log v0.8.0
-	github.com/hashicorp/terraform-plugin-testing v1.2.0
+	github.com/hashicorp/terraform-plugin-docs v0.15.0
+	github.com/hashicorp/terraform-plugin-framework v1.3.1
+	github.com/hashicorp/terraform-plugin-go v0.16.0
+	github.com/hashicorp/terraform-plugin-log v0.9.0
+	github.com/hashicorp/terraform-plugin-testing v1.3.0
 )

I attach a PNG and a flamegraph (flamegraph in gist cpu.svg).

(attached image: cpu profile graph)

@jeremmfr
Author

jeremmfr commented Jun 17, 2023

A profiling result for a complete call of PlanResourceChange and ValidateResourceConfig, instead of a 30-second dump.
(flamegraph in gist cpu2.svg)

(attached image: cpu2 profile graph)

@jeremmfr
Author

jeremmfr commented Jun 17, 2023

A profiling result for a complete call of ValidateResourceConfig (~7 seconds)
(flamegraph in gist cpu3.svg)

(attached image: cpu3 profile graph)

@bflad
Contributor

bflad commented Jun 30, 2023

Hi @jeremmfr 👋 Would you be able to try out hashicorp/terraform-plugin-go#308? From my local testing with the hashicups-pf example, it reduced the plan time by over half. I'm curious what level of real world benefit you might see as well.

@jeremmfr
Author

jeremmfr commented Jul 1, 2023

Hi @bflad 👋

We ran multiple scenarios to compare the potential benefit.

The PR reduces the time, but there is still a gap between the SDKv2 and framework versions of the provider.

The results:

| in seconds | SDKv2 | Framework | Framework with terraform-plugin-go#308 |
|---|---|---|---|
| hashicups_order with 1000 items blocks | ~ 3.8 | ~ 48 | ~ 14 |
| junos_security_address_book with 1000 network_address blocks | ~ 1.3 | ~ 78 | ~ 17 |
| junos_security_address_book with 1000 address_set blocks | ~ 3.2 | ~ 198 | ~ 45 |

The difference between the network_address block and the address_set block is that network_address has only primitive attributes (String), while address_set has primitive attributes (String) plus primitive set attributes (sets of String).
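
For illustration, a rough sketch of how those two block shapes differ in framework schema terms; the package name, the addressBookBlocks helper, and the attribute details are assumptions, not the actual jeremmfr/terraform-provider-junos code:

```go
package provider

import (
	"github.com/hashicorp/terraform-plugin-framework/resource/schema"
	"github.com/hashicorp/terraform-plugin-framework/types"
)

// addressBookBlocks is a hypothetical helper showing the approximate
// shapes of the two blocks being compared.
func addressBookBlocks() map[string]schema.Block {
	return map[string]schema.Block{
		// Each element has only primitive (String) attributes.
		"network_address": schema.SetNestedBlock{
			NestedObject: schema.NestedBlockObject{
				Attributes: map[string]schema.Attribute{
					"name":  schema.StringAttribute{Required: true},
					"value": schema.StringAttribute{Required: true},
				},
			},
		},
		// Each element has primitive attributes plus a primitive set
		// attribute (a set of String) nested inside the outer set.
		"address_set": schema.SetNestedBlock{
			NestedObject: schema.NestedBlockObject{
				Attributes: map[string]schema.Attribute{
					"name": schema.StringAttribute{Required: true},
					"address": schema.SetAttribute{
						ElementType: types.StringType,
						Optional:    true,
					},
				},
			},
		},
	}
}
```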

Terraform code
  • hashicups_order with items
resource "hashicups_order" "example" {
  dynamic "items" {
    for_each = range(1, 1000)
    content {
      quantity = items.value
    }
  }
}
  • junos_security_address_book with network_address
resource "junos_security_address_book" "global" {
  name = "global"

  dynamic "network_address" {
    for_each = range(1, 1000)
    content {
      name  = "test${network_address.value}"
      value = "1.1.1.1/32"
    }
  }
}
  • junos_security_address_book with address_set
resource "junos_security_address_book" "global" {
  name = "global"

  dynamic "address_set" {
    for_each = range(1, 1000)
    content {
      name    = "test${address_set.value}"
      address = ["baz"]
    }
  }
}
Go replace in go.mod to use PR terraform-plugin-go#308:

replace github.com/hashicorp/terraform-plugin-go => github.com/hashicorp/terraform-plugin-go v0.17.1-0.20230630191346-2155f0b1810f

I ran new profiling with pprof for the last scenario (junos_security_address_book with 1000 address_set blocks and the framework with terraform-plugin-go#308).

The result as a PNG and a flamegraph (in gist):

(attached image: cpu4 profile graph)

@bflad
Contributor

bflad commented Jul 3, 2023

Thanks for the confirmation and the additional context, @jeremmfr. 😄 Given the overall very positive effects, I've pulled in the upstream changes and will cut a release of those shortly.

I'm guessing there are two threads to pull on next with the framework side of things:

  • Why logging via terraform-plugin-log and go-hclog is so expensive in syscalls. Maybe something can be done to alleviate that compute pressure, such as ensuring that each logging call does not result in a flush (or seeing whether that behavior can be configured).
  • Investigating the reduction of allocations for path information. One challenge there is that paths are implemented as a slice of path steps, which in Go means that even when paths are passed by reference, the underlying steps data is mutable unless the implementation is very careful to prevent that possibility (see the sketch below).
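
To make that second point concrete, here is a minimal sketch of the slice aliasing involved; the step and path types are generic stand-ins, not the framework's actual path implementation:

```go
package main

import "fmt"

// step and path mimic, in shape only, a path made of a slice of steps.
type step string

type path struct {
	steps []step
}

func main() {
	p := path{steps: []step{"items", "0", "quantity"}}

	// Copying the struct copies the slice header, not the backing array,
	// so both copies alias the same steps data.
	q := p
	q.steps[0] = "mutated"
	fmt.Println(p.steps[0]) // prints "mutated"

	// Avoiding the aliasing requires an explicit copy of the backing
	// array, which is exactly the per-path allocation being discussed.
	r := path{steps: append([]step(nil), p.steps...)}
	r.steps[0] = "items"
	fmt.Println(p.steps[0]) // still "mutated": r no longer aliases p
}
```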

One other reality is that the framework went GA with its data handling split half in terraform-plugin-go and half in terraform-plugin-framework. There is a lot of passing back and forth between the two type systems in the PlanResourceChange logic. Ideally, everything would have been done in terraform-plugin-framework, only using terraform-plugin-go's type system for the request/response handling. It is unfortunately very difficult to change that until a major version of the framework removes the directly exposed terraform-plugin-go types in certain places.

bflad added a commit that referenced this issue Jul 3, 2023
…rmValue()

Reference: #775

The prior implementation used the exported `AttributeTypes()` method, which in this case was unnecessarily creating a copy of the map when it was only being used for walking the existing data. At scale, such as handling 1000+ objects, this creates unnecessary pressure on the Go garbage collector.

`benchstat` comparison with new benchmark test:

```text
goos: darwin
goarch: arm64
pkg: github.com/hashicorp/terraform-plugin-framework/types/basetypes
                                   │  original   │              proposed               │
                                   │   sec/op    │   sec/op     vs base                │
ObjectValueToTerraformValue1000-10   408.4µ ± 1%   318.4µ ± 1%  -22.02% (p=0.000 n=10)

                                   │   original   │               proposed               │
                                   │     B/op     │     B/op      vs base                │
ObjectValueToTerraformValue1000-10   449.2Ki ± 0%   286.6Ki ± 0%  -36.19% (p=0.000 n=10)

                                   │  original   │              proposed              │
                                   │  allocs/op  │  allocs/op   vs base               │
ObjectValueToTerraformValue1000-10   2.051k ± 0%   2.020k ± 0%  -1.51% (p=0.000 n=10)
```
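
As a generic illustration of the pattern that commit avoids: the object type and AttributeTypes method below are stand-ins, not the framework's actual basetypes code.

```go
package main

import "fmt"

type object struct {
	attributeTypes map[string]string
}

// AttributeTypes mirrors an exported accessor: it returns a defensive
// copy so callers cannot mutate the object's internal map.
func (o object) AttributeTypes() map[string]string {
	m := make(map[string]string, len(o.attributeTypes))
	for k, v := range o.attributeTypes {
		m[k] = v
	}
	return m
}

func main() {
	o := object{attributeTypes: map[string]string{"quantity": "int64"}}

	// Before: converting every object allocated a copy of the map just
	// to read its keys and values, adding garbage-collector pressure
	// when handling 1000+ objects.
	for name, typ := range o.AttributeTypes() {
		fmt.Println(name, typ)
	}

	// After: read-only walks use the internal map directly and allocate
	// nothing extra.
	for name, typ := range o.attributeTypes {
		fmt.Println(name, typ)
	}
}
```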
bflad added a commit that referenced this issue Jul 5, 2023
…rmValue() (#791)

Reference: #775

@bflad
Contributor

bflad commented Jul 5, 2023

@jeremmfr if you have a moment, I'm curious if 9fe2ca5 (which should include terraform-plugin-go#308) offers some slight benefits with your provider. I'm guessing it'll be fairly minimal, but it was low hanging fruit to prevent unnecessary memory allocation and cleanup work for the runtime.

@jeremmfr
Author

jeremmfr commented Jul 6, 2023

Hi @bflad, I ran the same scenarios as before with PR #791, but the difference is not visible in the plan time.

@alexhung
Copy link

@bflad Any update on this issue? Or any suggestions for a workaround?
