Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

startup_script is 64encoded again if passed as base64 #168

Closed
DavidGOrtega opened this issue Jul 26, 2021 · 10 comments
Closed

startup_script is 64encoded again if passed as base64 #168

DavidGOrtega opened this issue Jul 26, 2021 · 10 comments
Assignees
Labels
duplicate Déjà lu

Comments

@DavidGOrtega
Copy link
Contributor

If you pass a base64 startup script its encoded again

@DavidGOrtega DavidGOrtega added bug Something isn't working p0-critical Max priority (ASAP) labels Jul 26, 2021
@0x2b3bfa0
Copy link
Member

See also #91?

@DavidGOrtega
Copy link
Contributor Author

If #91 is not merge nor fixed... why is it happening 🤔 ?

@0x2b3bfa0
Copy link
Member

This should not happen, at least not for resource_cml_runner

script, err := base64.StdEncoding.DecodeString(d.Get("startup_script").(string))

Can you provide additional context and a minimal example?

@0x2b3bfa0 0x2b3bfa0 self-assigned this Jul 26, 2021
@casperdcl casperdcl added p1-important High priority and removed p0-critical Max priority (ASAP) labels Aug 3, 2021
@pmrowla
Copy link

pmrowla commented Sep 2, 2021

This issue still affects resource_machine and makes it impossible to specify a startup script if you are using JSON (.tf.json) instead of HCL config files. The heredoc syntax has to be b64 encoded before writing to main.tf.json (to avoid having any invalid characters in the JSON string), and then it gets doubly encoded again by TPI.

In the short term we can work around this on the DVC side by using HCL for now, but this will eventually be a blocker for us.

@pmrowla
Copy link

pmrowla commented Sep 2, 2021

given this main.tf.json:

{
  "//": "This file auto-generated by dvc-tpi, do not edit manually.",
  "terraform": {
    "required_providers": {
      "iterative": {
        "source": "iterative/iterative"
      }
    }
  },
  "provider": {
    "iterative": {}
  },
  "resource": {
    "iterative_machine": {
      "peter-test": {
        "name": "peter-test",
        "cloud": "aws",
        "startup_script": "PDxFT0YKIyEvYmluL2Jhc2gKc3VkbyBhcHQtZ2V0IHVwZGF0ZQpzdWRvIGFwdC1nZXQgaW5zdGFsbCAtLXllcyBweXRob24zIHB5dGhvbjMtcGlwCnB5dGhvbjMgLW0gcGlwIGluc3RhbGwgLS11cGdyYWRlIHBpcAoKY2QgL2V0Yy9hcHQvc291cmNlcy5saXN0LmQKc3VkbyB3Z2V0IGh0dHBzOi8vZHZjLm9yZy9kZWIvZHZjLmxpc3QKc3VkbyBhcHQtZ2V0IHVwZGF0ZQpzdWRvIGFwdC1nZXQgaW5zdGFsbCAtLXllcyBkdmMKCkVPRg=="
      }
    }
  }
}

I would expect the machine to be created with this startup script:

<<EOF
#!/bin/bash
sudo apt-get update
sudo apt-get install --yes python3 python3-pip
python3 -m pip install --upgrade pip

cd /etc/apt/sources.list.d
sudo wget https://dvc.org/deb/dvc.list
sudo apt-get update
sudo apt-get install --yes dvc

EOF

I've tried using the b64 encoded version of the script both with and without the heredoc <<EOF...EOF, but as far as I can tell neither one works right now.

Using a main.tf (in HCL instead of JSON) with the heredoc script works as expected.

@0x2b3bfa0
Copy link
Member

0x2b3bfa0 commented Sep 3, 2021

This issue still affects resource_machine and makes it impossible to specify a startup script if you are using JSON (.tf.json) instead of HCL config files.

Thanks for reproducing the issue! Then, it only "affects" resource_machine and not resource_cml_runner; fixing #91 would be as simple as modifying the latter to behave as the former. 🎉

The heredoc syntax has to be b64 encoded before writing to main.tf.json (to avoid having any invalid characters in the JSON string), and then it gets doubly encoded again by TPI.

I don't think JSON Configuration Syntax supports heredoc strings at all. Ideally startup_script should be a plain JSON string, and escaping or Base64 encoding should be the responsibility of the user; that is, you. 😉

Most sane JSON serializers will automatically escape any valid UTF-8 string so you don't have to, but, if you still want to resort to Base64, it's as easy as using Terraform's inbuilt base64decode and templates to do the decoding in place for you.

Valid JSON script

{
  "startup_script": "#!/bin/bash\nsudo apt-get update\nsudo apt-get install --yes python3 python3-pip\npython3 -m pip install --upgrade pip\n\ncd /etc/apt/sources.list.d\nsudo wget https://dvc.org/deb/dvc.list\nsudo apt-get update\nsudo apt-get install --yes dvc"
}

Valid Base64 script

{
  "startup_script": "${base64decode(\"IyEvYmluL2Jhc2gKc3VkbyBhcHQtZ2V0IHVwZGF0ZQpzdWRvIGFwdC1nZXQgaW5zdGFsbCAtLXllcyBweXRob24zIHB5dGhvbjMtcGlwCnB5dGhvbjMgLW0gcGlwIGluc3RhbGwgLS11cGdyYWRlIHBpcAoKY2QgL2V0Yy9hcHQvc291cmNlcy5saXN0LmQKc3VkbyB3Z2V0IGh0dHBzOi8vZHZjLm9yZy9kZWIvZHZjLmxpc3QKc3VkbyBhcHQtZ2V0IHVwZGF0ZQpzdWRvIGFwdC1nZXQgaW5zdGFsbCAtLXllcyBkdmMK\")}"
}

Sample JSON encoder

import json

print(json.dumps({"startup_script": "#!/bin/bash ···"}))

Sample Base64 encoder

import json
import base64

def escape(string: str) -> str:
    encoded: str = base64.b64encode(string.encode()).decode()
    return f"""${{base64decode("{encoded}")}}"""

print(json.dumps({"startup_script": escape("#!/bin/bash ···")}))

@0x2b3bfa0
Copy link
Member

@casperdcl, how would this fit in iterative/tpi? I guess that you can get rid of HCL2 once and for all.

@pmrowla
Copy link

pmrowla commented Sep 4, 2021

Ok, that was my misunderstanding then. I was assuming that in JSON, the terraform provider needed startup_script to be b64 encoded since that is how it is ends up being encoded in the tfstate JSON. It was also unclear from looking at the terraform provider source, since resource_cml_runner does handle a b64 encoded startup_script (whether it comes from JSON or HCL), but resource_machine does not.

iterative/tpi already works as-is whether the user is using the HCL render_template or the JSON render_json to generate the terraform config. iterative/tpi just expects the user to pass their startup_script as a (multiline) string, and either dumps the string directly as JSON, or adds the heredoc wrapper in the templated HCL.

@0x2b3bfa0
Copy link
Member

since resource_cml_runner does handle a b64 encoded startup_script (whether it comes from JSON or HCL), but resource_machine does not.

Touché! Sorry for the inconsistent API, #91's purpose is exactly modifying resource_cml_runner to behave like resource_machine and take raw scripts.

iterative/tpi already works as-is whether the user is using the HCL render_template or the JSON render_json to generate the terraform config. iterative/tpi just expects the user to pass their startup_script as a (multiline) string, and either dumps the string directly as JSON, or adds the heredoc wrapper in the templated HCL.

Excellent! I overlooked render_json 🤦🏼

@0x2b3bfa0
Copy link
Member

Closing in favor of #91

@0x2b3bfa0 0x2b3bfa0 added duplicate Déjà lu and removed bug Something isn't working p1-important High priority labels Sep 4, 2021
@0x2b3bfa0 0x2b3bfa0 mentioned this issue Jul 21, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
duplicate Déjà lu
Projects
None yet
Development

No branches or pull requests

4 participants