-
Notifications
You must be signed in to change notification settings - Fork 233
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
bug: rounding large int values from state #655
bug: rounding large int values from state #655
Comments
Hi @slevenick! Thanks for reporting this. Internally within Terraform itself, numbers are For that reason, my initial guess here would be that something in the Terraform SDK is round-tripping these values through a A little digging led me to a direct I've not proven that yet, but if that is the cause then this is something we'll need to tackle in the SDK repository rather than in this repository. I think to be robust there the SDK logic would need to decode into a |
A further thing I intended to mention in the previous comment: if this is indeed an SDK bug then the Terraform Core version shouldn't make any difference here. Instead, upgrading the provider would be the most likely trigger, because the provider might've introduced a new schema version for |
Looking in the implementation of |
Wild guess here, but is it possible that 0.14 is running UpgradeResourceState more frequently than 0.13 did? I've heard some reports that I've not been able to substantiate that providers that do not implement UpgradeResourceState and don't have any state upgrades in place run into problems with plugin-go, and I suppose one (again, reiterating that I'm guessing) plausible explanation is that 0.14 calls UpgradeResourceState even when the state does not need upgrading sometimes, which I think would explain how this showed up in 0.14 but not 0.13. terraform-plugin-go uses json.Number and big.Float everywhere, I believe. terraform-plugin-sdk is a mess of different number types and that has bitten us before, so it would not surprise me that we're using a float64 where it's not optimal to be. The engineering lift to fix it wholesale is, unfortunately, pretty heavy because of compatibility concerns, but we may be able to get away with it. We can transfer this issue to the SDK repo, we can leave it here until we ascertain why it suddenly showed up in 0.14, or we can transfer to the SDK and open an issue here to get to the bottom of what changed in 0.14 to prompt it, I'm fine with any of these approaches. |
|
Going to move this to the SDK, I believe I've replicated this in |
@slevenick as a workaround for now, if you have any numbers that go over 64 bits (or approach 64 bits) I'd recommend strings in the meantime. |
I don't know if it is the same issue but with 0.14 I am getting these errors for a GKE cluster:
and plan/apply fails. 13.5 works fine. |
That would be surprising, and a different but potentially related issue. Those values are strings and shouldn't have the rounding problem that we are seeing with large ints. This looks like the GKE version has changed, but I'm not sure why that would be different between 0.14 and 0.13x. When you plan/apply on 0.13 does it show any changes? Can you include debug logs for the failing plan? And potentially values in state/config for the |
@cagataygurturk if you are using |
@paultyng @slevenick What is the plan to fix this? Can this be reverted to 0.13 behaviour? |
I believe our current understanding is this is the 0.13 behavior, though I have no compelling answer as to why it suddenly became an issue. I'm hoping to get a fix out in the next few days, if possible. Providers would then need to update their SDK versions and ship new releases. |
I was able to verify this in 0.12.29 (the oldest I tested): PS C:\Users\paul\Sandbox\bigints> terraform version
Terraform v0.12.29
+ provider.random v3.0.0
Your version of Terraform is out of date! The latest version
is 0.14.2. You can update by downloading from https://www.terraform.io/downloads.html
PS C:\Users\paul\Sandbox\bigints> cat main.tf
resource "random_integer" "bigint" {
min = 7227701560655103597
max = 7227701560655103598
}
PS C:\Users\paul\Sandbox\bigints> terraform apply
An execution plan has been generated and is shown below.
Resource actions are indicated with the following symbols:
+ create
Terraform will perform the following actions:
# random_integer.bigint will be created
+ resource "random_integer" "bigint" {
+ id = (known after apply)
+ max = 7227701560655103598
+ min = 7227701560655103597
+ result = (known after apply)
}
Plan: 1 to add, 0 to change, 0 to destroy.
Do you want to perform these actions?
Terraform will perform the actions described above.
Only 'yes' will be accepted to approve.
Enter a value: yes
random_integer.bigint: Creating...
random_integer.bigint: Creation complete after 0s [id=7227701560655103598]
Apply complete! Resources: 1 added, 0 changed, 0 destroyed.
PS C:\Users\paul\Sandbox\bigints> terraform apply
random_integer.bigint: Refreshing state... [id=7227701560655103598]
An execution plan has been generated and is shown below.
Resource actions are indicated with the following symbols:
-/+ destroy and then create replacement
Terraform will perform the following actions:
# random_integer.bigint must be replaced
-/+ resource "random_integer" "bigint" {
~ id = "7227701560655103598" -> (known after apply)
~ max = 7227701560655104000 -> 7227701560655103598 # forces replacement
~ min = 7227701560655104000 -> 7227701560655103597 # forces replacement
~ result = 7227701560655104000 -> (known after apply)
}
Plan: 1 to add, 0 to change, 1 to destroy.
Do you want to perform these actions?
Terraform will perform the actions described above.
Only 'yes' will be accepted to approve.
Enter a value: yes
random_integer.bigint: Destroying... [id=7227701560655103598]
random_integer.bigint: Destruction complete after 0s
random_integer.bigint: Creating...
random_integer.bigint: Creation complete after 0s [id=7227701560655103598]
Apply complete! Resources: 1 added, 0 changed, 1 destroyed.
PS C:\Users\paul\Sandbox\bigints> |
See hashicorp/terraform-plugin-sdk#655. Essentially, test that integers that do not fit cleanly in float64s still retain their precision. These tests currently fail.
OK so I've done a lot of digging into this. So fundamentally, the cause of the issue here was the liberal use of JSON encoding and decoding that the SDK does. Why we do this is a longer story and requires more context, but fundamentally JSON is used as sort of a babel-representation of different systems and so when converting between, e.g., 0.11 and 0.12's types, sometimes we just encode to JSON and decode back out. Unfortunately, it is not always the case that our decoding uses the This is, usually, fine. However, for sufficiently large numbers (I believe the ceiling is 17 significant figures, but could be wrong about that) float64 can't actually represent them with precision, so it... silently rounds them. Whoops. This is all basically what Martin said a week ago. The fix here is straightforward technically, but messy when people get involved. We can just switch to using a JSON decoder with There are a couple of compatibility concerns, though. First, the use of Second, the All of this to say: the problem is known (I still don't particularly understand why 0.14 exacerbated this; or if it didn't, as seems plausible given we have multiple reproductions of it in 0.12 and 0.13, why it hasn't been an issue before now) and we're making progress on solutions, fixing this just involves moving some pieces that our ecosystem relies directly on, and so just like replacing a load-bearing pillar, we're going to need to move carefully and that takes a little bit of time. But we are working on it. |
Hi, |
Go's JSON decoder, by default, uses a lossy conversion of JSON integers to float64s. For sufficiently large integers, this yields a loss of precision, and that causes problems with diffs and plans not matching. See #655 for more details. This PR proposes a solution: keeping the lossless json.Number representation of integers in state files. This will ensure that no precision is lost, but it comes at a cost. json.Number is a string type, not a float64 type. That means that existing state upgraders that (correctly) cast a value to float64 will break, as the value will now be surfaced to them as a json.Number, which cannot be cast to a float64. To handle this, the schema.Resource type gains a `UseJSONNumber` property. When set to new, users are opted into the new json.Number values. When left false, the default, users get the existing float64 behavior. If a resource with state upgraders wants to use the new json.Number behavior, it must update all the state upgraders for that resource to use json.Number instead of float64 or users with old state files will panic during upgrades. Note: the backwards compatibility properties of this commit are unknown at this time, and there may be a way for us to avoid using schema.Resource.UseJSONNumber, instead preserving the choice until we get to the point where TypeInt casts the number to an int. Further investigation needed.
@marekaf as mentioned above, this doesn't seem to be limited to 0.14 and seems to be affecting 0.13 and 0.12 since it is provider side. You could try to downgrade your provider (not the CLI), and see if older versions can handle this, they would need to be on the v1 SDK, as opposed to the v2 SDK I think, and I'm not sure where that switched in all providers. 15-17 integer digits is about the most you can safely carry in a float64 without any potential loss and this ID has crossed that threshold so it will occasionally have losses until the SDK fix is merged in to the provider or the typing is changed. |
thanks! I managed to work around it by just making sure the apply will not touch url_map (not possible via -targeting so I had to simply change manually in GCP web console everything to match the config in terraform hcl so that not drift would be detected). Everything else works for plan & apply. |
Go's JSON decoder, by default, uses a lossy conversion of JSON integers to float64s. For sufficiently large integers, this yields a loss of precision, and that causes problems with diffs and plans not matching. See #655 for more details. This PR proposes a solution: keeping the lossless json.Number representation of integers in state files. This will ensure that no precision is lost, but it comes at a cost. json.Number is a string type, not a float64 type. That means that existing state upgraders that (correctly) cast a value to float64 will break, as the value will now be surfaced to them as a json.Number, which cannot be cast to a float64. To handle this, the schema.Resource type gains a `UseJSONNumber` property. When set to new, users are opted into the new json.Number values. When left false, the default, users get the existing float64 behavior. If a resource with state upgraders wants to use the new json.Number behavior, it must update all the state upgraders for that resource to use json.Number instead of float64 or users with old state files will panic during upgrades. Note: the backwards compatibility properties of this commit are unknown at this time, and there may be a way for us to avoid using schema.Resource.UseJSONNumber, instead preserving the choice until we get to the point where TypeInt casts the number to an int. Further investigation needed.
This should be resolved in 2.4.0 that is rolling out now. Resources impacted will need to set |
I'm having the same issue with google_compute_url_map resource since I upgraded to terraform v0.14.0
I also tried with terraform v0.14.3, it's not working also. this action (changing the backend) was working perfectly with terraform version 0.13.5 |
Prior to this change, integers in state that do not fit cleanly in float64s lose their precision, leading to permadiffs. As of SDK v2.4.0, setting UseJSONNumber on the resource schema changes number representation to json.Number throughout, preserving precision. Please see hashicorp/terraform-plugin-sdk#655 for more information.
I've updated the title of this issue to prevent confusion when referring to it in the future, as this is not a 0.14 bug. |
I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues. If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further. |
Terraform Version
Terraform Configuration Files
Nothing here is particularly important, only that
google_compute_url_map
stores a large value in a TypeInt. This value is Computed and is not changing between calls ofterraform plan
Debug Output
Key line:
This turns into an error at apply-time
Crash Output
Expected Behavior
TypeInt value is read from state without rounding
Actual Behavior
TypeInt value is read from state with rounding
Value in state:
"map_id": 7227701560655103598,
Value returned from d.Get("map_id"):
7227701560655104000
Steps to Reproduce
terraform apply
terraform plan
Shows the error in debug logs. Changing the description of the URL map causing an update and then running
terraform apply
causes Terraform to error with:Additional Context
I have tracked this down to a change in state between two runs of
terraform apply
. The value is set correctly in state and is written to the state file correctly, but seems to round when the value is read by the next invocation ofterraform plan/apply
This behavior was introduced in 0.14.0, as these resources did not have a problem in the past.
References
The text was updated successfully, but these errors were encountered: