Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rework HCP fingerprint generation #12172

Merged
merged 2 commits into from
Jan 25, 2023
Merged

Conversation

lbajolet-hashicorp
Copy link
Contributor

@lbajolet-hashicorp lbajolet-hashicorp commented Dec 20, 2022

Fingerprints are how we link a packer build to an iteration on HCP. These are computed automatically from the Git SHA in the current state, and are unique to the bucket/iteration.

The main problem with this approach is that while sound in theory, it quickly falls apart when users want to run the same build configuration twice, but expect a new image to be created.

With the current model, this fails, as the iteration with the current SHA already exists.

While this is solvable through environment variables, or by committing a change to the repository, we think this is not clear enough, and causes an extra step to what should otherwise be a simple process.

Therefore, to lower the barrier of entry into HCP, we change this behaviour with this commit.

Now, fingerprints are randomly generated ULIDs instead of a git SHA, and a new one is always generated, unless one is already specified in the environment.

This makes continuation of an existing iteration a conscious choice rather than something automatic, and virtually eliminates conflicts such as the ones described above.

CHANGELOG.md Show resolved Hide resolved
t.Cleanup(func() {
//nolint:errcheck
os.RemoveAll(tempdir("4ec004e18e"))
})
},
},
{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!

internal/hcp/registry/types.iterations.go Outdated Show resolved Hide resolved
internal/hcp/registry/types.builds.go Outdated Show resolved Hide resolved
@@ -80,3 +89,23 @@ func (h *JSONMetadataRegistry) CompleteBuild(
) ([]sdkpacker.Artifact, error) {
return h.bucket.completeBuild(ctx, build.Name(), artifacts, buildErr)
}

func (h *JSONMetadataRegistry) IterationStatusSummary(ui sdkpacker.Ui) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!!! 🎉

internal/hcp/registry/hcl.go Outdated Show resolved Hide resolved
@lbajolet-hashicorp lbajolet-hashicorp force-pushed the hcp_fingerprint_redo branch 2 times, most recently from 96da68b to d0af4ab Compare December 21, 2022 16:07
Copy link
Contributor

@nywilken nywilken left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is off to a good start. It is nice to see how little rework was needed to make this happen.

I left a few suggestions and some comments from my testing so far. Let me know what you think.

internal/hcp/registry/hcl.go Outdated Show resolved Hide resolved
@@ -109,3 +116,35 @@ func withPackerEnvConfiguration(bucket *Bucket) hcl.Diagnostics {

return nil
}

// getGitSHA returns the HEAD commit for some template dir defined in opt.TemplateBaseDir.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// getGitSHA returns the HEAD commit for some template dir defined in opt.TemplateBaseDir.
// getGitSHA returns the HEAD commit for some template dir defined in baseDir.

command/build.go Outdated
@@ -378,6 +378,8 @@ func (c *BuildCommand) RunContext(buildCtx context.Context, cla *BuildArgs) int
ret = 1
}

hcpRegistry.IterationStatusSummary(c.Ui)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice summary function. I think the output can benefit from a little spacing after the build artifact details.

~>  packer build -only='file2' ../simple.json
file.file2: output will be in this color.

Build 'file.file2' finished after 156 milliseconds 996 microseconds.

==> Wait completed after 443 milliseconds 780 microseconds

==> Builds finished. The artifacts of successful builds are:
--> file.file2: Stored file: /tmp/dummy_artifact
--> file.file2: Published metadata to HCP Packer registry packer/hcppacker-test/iterations/01GMTX0FF36753GY451EE44QHW
Iteration "01GMTX0DSP3CD81ESQ859ESQJ6" is not complete, the following builds are not done:
* "file": UNSET
You may resume work on this iteration in further Packer builds by defining the following variable in your environment:
HCP_PACKER_BUILD_FINGERPRINT="01GMTX0DSP3CD81ESQ859ESQJ6"```

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also think that we should display the generated fingerprint at the beginning of a build as well so that the user doesn't have to wait until the end of a build to get the iteration summary. What do you think?

Maybe having something like

This build is being tracked on HCP Packer with the Fingerprint .... 

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I ran a quick test where I had a long running build that I cancelled. The iteration summary does not display, as expected.

~>  HCP_PACKER_BUILD_FINGERPRINT="01GMTY7BNZVT7V24NXA29C2HRM" packer build googlecompute-centos-shell.json
googlecompute: output will be in this color.

==> googlecompute: Checking image does not exist...
==> googlecompute: Creating temporary RSA SSH key for instance...
==> googlecompute: Importing SSH public key for OSLogin...
Cancelling build after receiving interrupt
==> googlecompute: Error obtaining token information needed for OSLogin: context canceled
==> googlecompute: Deleting SSH public key for OSLogin...
==> googlecompute: Error deleting SSH public key for OSLogin. Please delete it manually.
==> googlecompute:
==> googlecompute: Error: googleapi: Error 401: End user credentials not provided.
Error: failed to complete HCP-enabled build "googlecompute"

build failed, not uploading artifacts


Build 'googlecompute' errored after 6 seconds 853 milliseconds: Error obtaining token information needed for OSLogin: context canceled

==> Wait completed after 6 seconds 853 milliseconds
Cleanly cancelled builds after being interrupted.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good call for the cancellation, I did forget that one. Should be an extra acceptance test for this too.
With the current implementation, this could be invoked through defer so we always have that invoked.
I'll reroll this later today

Copy link
Contributor

@nywilken nywilken Dec 21, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

defer so we always have that invoked.

I believe if a builder panics due to some runtime issue that the defer will not execute. This might be a little hard to test but it is worth testing.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd be surprised, builders run in a separate process, so if they hard crash, it'll end-up with the build failing.
If Packer crashes however it won't be called for sure, but in this case there's not much we can do regardless

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also think that we should display the generated fingerprint at the beginning of a build as well so that the user doesn't have to wait until the end of a build to get the iteration summary. What do you think?

The idea is sound, to do so however we would need to pass the Ui through to other functions, so at that point I think it may be better to pass it at creation-time, and reuse it when necessary.
What do you think? I think it may make sense, and it'd let us drop it from the function we just added.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI: I came back to this PR today, I opted to set the UI as part of the structures for HCL2/JSON, so now we print a message stating on which iteration the build is being tracked, as per your suggestion @nywilken

internal/hcp/registry/types.iterations.go Outdated Show resolved Hide resolved
@nywilken
Copy link
Contributor

nywilken commented Dec 21, 2022

I know this is still in progress but I when the time comes we need to update the documentation around fingerprinting on https://developer.hashicorp.com/packer/docs/hcp and on https://developer.hashicorp.com/hcp/docs/packer/store-image-metadata/packer-template-configuration#build-fingerprint

Mentioning to remind ourselves post holiday 😄

return todo
}

func (i *Iteration) iterationStatusSummary(ui sdkpacker.Ui) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice

The iteration structure that we use for linking a packer build to an
iteration on HCP defines a `Labels' attribute, which is never set nor
read from at any point.

Since it is unused, we remove it in this commit.
@@ -29,7 +29,7 @@ You must set the following environment variables to enable Packer to push metada

You can set these additional environment variables to control how metadata is pushed to the registry.

- `HCP_PACKER_BUILD_FINGERPRINT` - A unique identifier assigned to each build. HCP Packer uses this identifier to determine if metadata for a build on the registry is complete. By default, HCP Packer uses the HEAD Git SHA for the template file. If the template is not version controlled, you must set this environment variable to a unique value before running `packer build`.
- `HCP_PACKER_BUILD_FINGERPRINT` - A unique identifier assigned to each build. HCP Packer uses this identifier to determine if metadata for a build on the registry is complete. By default, Packer will generate a new unique fingerprint for the template file on each new build. If you want to continue building an iteration, you will need to either set a unique fingerprint as HCP_PACKER_BUILD_FINGERPRINT before starting the build or you may use the generated one on subsequent builds.
Copy link
Contributor

@sylviamoss sylviamoss Jan 23, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you want to continue building an iteration, you will need to either set a unique fingerprint as HCP_PACKER_BUILD_FINGERPRINT before starting the build or you may use the generated one on subsequent builds.

I am finding this part a little confusing because you may use the generated one on subsequent builds. is also setting HCP_PACKER_BUILD_FINGERPRINT, right?

Suggested change
- `HCP_PACKER_BUILD_FINGERPRINT` - A unique identifier assigned to each build. HCP Packer uses this identifier to determine if metadata for a build on the registry is complete. By default, Packer will generate a new unique fingerprint for the template file on each new build. If you want to continue building an iteration, you will need to either set a unique fingerprint as HCP_PACKER_BUILD_FINGERPRINT before starting the build or you may use the generated one on subsequent builds.
- `HCP_PACKER_BUILD_FINGERPRINT` - A unique identifier assigned to each build. HCP Packer uses this identifier to determine if metadata for a build on the registry is complete. By default, Packer will generate a new unique fingerprint for the template file on each new build. If you want to continue building an iteration, you will need to set HCP_PACKER_BUILD_FINGERPRINT with the iteration's fingerprint before starting the build.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah now that you mention it I tried to address two problems at once here, and I agree it can be confusing. Re-reading the comment there's a few more things that warrant a rephrasing here, I'll change that.
To your point, the difference between the two alternatives are:

  1. you choose a unique pattern for your iteration before first building your template (i.e. first invocation of packer build)
  2. you let Packer generate one for you, and will have to set it before running another build if you want to continue working on it (if there was an error or a --only/--except clause anywhere)

To avoid overcharging this bullet point, I'll probably add a paragraph for managing the fingerprint, so we can detail the alternatives and in which case you may want one over the other.
I also wonder if we should add this to this page or if we should create another one. I'll add this to that page for now, and we can discuss if that'd be better elsewhere.

Copy link
Contributor

@nywilken nywilken left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm spent a little time looking through the documentation changes and logic for iteration summary. I left a few suggestions. I will put this down and run it against a few test cases.

website/content/docs/hcp/index.mdx Outdated Show resolved Hide resolved
website/content/docs/hcp/index.mdx Outdated Show resolved Hide resolved
website/content/docs/hcp/index.mdx Outdated Show resolved Hide resolved
website/content/docs/hcp/index.mdx Outdated Show resolved Hide resolved
@@ -90,12 +90,14 @@ func (c *BuildCommand) RunContext(buildCtx context.Context, cla *BuildArgs) int
return ret
}

hcpRegistry, diags := registry.New(packerStarter)
hcpRegistry, diags := registry.New(packerStarter, c.Ui)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seeing as the the Ui is only used for the IterationStatusSummary have you considered just passing in the Ui directly into defer hcpRegistry.IterationStatusSummary() and removing the need for persisting it on the registry type?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that was the initial implementation, I decided to add Ui as part of the state for the registry so we can use it to print messages anytime we want after a previous feedback. We can go back to this implementation, the current one uses Ui in two places, once in the IterationStatusSummary, and during the initialisation for HCL2/JSON, where it mentions the fingerprint we are tracking the iteration on.

The second log was not there initially and was added after a feedback pass, but I like the idea of having it here, since in case of a crash you will still have the fingerprint available to continue building upon should you want to.

@@ -127,9 +141,12 @@ func NewHCLMetadataRegistry(config *hcl2template.PackerConfig) (*HCLMetadataRegi
bucket.RegisterBuildForComponent(source.String())
}

ui.Say(fmt.Sprintf("Tracking build on HCP Packer with fingerprint %q", bucket.Iteration.Fingerprint))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there another way to reuse the content of the IterationStatusSummary without having to copy it into this call. The messaging below is helpful.

You may resume work on this iteration in further Packer builds by defining the following variable in your environment:
HCP_PACKER_BUILD_FINGERPRINT="generated-fingerprint"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The messaging you reference is useful when the iteration fails, but I'm not sure we should reuse the same phrasing for notifying users of which iteration we are tracking their build on at the beginning of the build.
I'm not sure it'd be worth it to split the IterationStatusSummary function to factorise such a small amount of code tbh, I'd be inclined to leave it as-is, unless you have a better idea?

Copy link
Contributor

@nywilken nywilken left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The updated documentation looks good. Thanks for making the updates. I would like to merge and get this into the latest nightly to test further.

website/content/docs/hcp/index.mdx Outdated Show resolved Hide resolved
Fingerprints are how we link a packer build to an iteration on HCP.
These are computed automatically from the Git SHA in the current state,
and are unique to the bucket/iteration.

The main problem with this approach is that while sound in theory, it
quickly falls apart when users want to run the same build configuration
twice, but expect a new image to be created.

With the current model, this fails, as the iteration with the current
SHA already exists.

While this is solvable through environment variables, or by committing a
change to the repository, we think this is not clear enough, and causes
an extra step to what should otherwise be a simple process.

Therefore, to lower the barrier of entry into HCP, we change this
behaviour with this commit.

Now, fingerprints are randomly generated ULIDs instead of a git SHA, and
a new one is always generated, unless one is already specified in the
environment.

This makes continuation of an existing iteration a conscious choice
rather than something automatic, and virtually eliminates conflicts such
as the ones described above.
@github-actions
Copy link

I'm going to lock this pull request because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems related to this change, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Feb 25, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants