RFC: Execution Environments #274

Open
wants to merge 7 commits into
base: main

hone
Member

@hone hone commented Feb 1, 2023

@hone hone requested a review from a team February 1, 2023 08:10
@buildpack-bot
Member

Maintainers,

As you review this RFC please queue up issues to be created using the following commands:

/queue-issue <repo> "<title>" [labels]...
/unqueue-issue <uid>

Issues

(none)

Signed-off-by: Terence Lee <[email protected]>

In order to support additional execution environments an `exec-env` key will be added to various TOML tables in the project. The value can be any string with `all` having special meaning. `all` will apply to all execution environments and will be the default if not specified. This should make it backwards compatible and optional. When `exec-env` is not set to `all`, the table settings will only be applied to that execution environment.
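The matching rule in this paragraph can be sketched in a few lines. This is only an illustration of the single-string form described here, using plain dictionaries in place of parsed TOML tables; the table contents and the helper names `applies` and `select` are hypothetical and not part of the RFC.

```python
# Sketch of the `exec-env` matching rule, assuming each TOML table has
# already been parsed into a dict. All table contents are made up.
ALL = "all"


def applies(table, exec_env):
    """Return True if the table's settings apply to exec_env.

    A missing `exec-env` key defaults to "all", which matches every
    execution environment (the backwards-compatible case).
    """
    declared = table.get("exec-env", ALL)
    return declared == ALL or declared == exec_env


def select(tables, exec_env):
    """Keep only the tables that apply to the requested environment."""
    return [t for t in tables if applies(t, exec_env)]


tables = [
    {"name": "BP_LOG_LEVEL", "value": "info"},  # no key: defaults to "all"
    {"name": "RAILS_ENV", "value": "test", "exec-env": "test"},
    {"name": "RAILS_ENV", "value": "production", "exec-env": "production"},
]

print([t["value"] for t in select(tables, "test")])
```

Tables without the key keep applying everywhere, which is what makes the key optional for existing files.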

### Project Descriptor - `project.toml` (App Developers)
Contributor

Can you add an example of a full project.toml that is used for producing a test image and a production image?

Member Author

Just for Project Descriptor?

@sambhav
Member

sambhav commented Feb 1, 2023

I'm having difficulty grasping how the environment is used by the different TOML files. @hone, could you please provide examples of places where you imagine it being used and how that information will be leveraged by buildpacks/platform?

hone added 2 commits February 1, 2023 15:04
* test
* development

### Buildpack API
Contributor

Can we perhaps get a conceptual example of what a buildpack author might do with this new execution environment information?

I have in my head a language buildpack author may...

  • install test group dependencies
  • create ruby-tests process / ruby-tests-verbose process
  • Skip cleaning up things they might clean up otherwise for production images
  • Set env vars to test mode operation (RAILS_ENV or similar)
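To make the list above concrete, here is a rough sketch of how a language buildpack might turn the execution environment into build decisions. Everything in it is an assumption for illustration: the function name `plan_for`, the dictionary keys, the `web` process, and the choice of which dependency groups to skip are not taken from the RFC or from any existing buildpack.

```python
def plan_for(exec_env):
    """Hypothetical per-environment decisions for a Ruby buildpack."""
    is_test = exec_env == "test"
    return {
        # Install the test group only when building a test image.
        "skip_dependency_groups": [] if is_test else ["test", "development"],
        # Contribute test processes instead of the usual launch process.
        "processes": ["ruby-tests", "ruby-tests-verbose"] if is_test else ["web"],
        # Keep build leftovers around for test images, clean up otherwise.
        "cleanup_build_artifacts": not is_test,
        # Put the framework into the matching mode.
        "env": {"RAILS_ENV": "test" if is_test else "production"},
    }


print(plan_for("test"))
print(plan_for("production"))
```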

@loewenstein
Contributor

Frankly, I can't align this proposal with my mental model of what buildpacks do and how they help.

For me, container images are just another software artefact nowadays produced to ship software, much like, a while back, one would produce JAR and/or WAR files that got delivered and finally deployed into some execution environment.
Buildpacks are a great tool to create those, but I don't see how or why I would create a test container that was separate from the production ones. I would want my tests to validate exactly the artefact that I am about to deliver. Otherwise, how do I know I checked the right thing?

From my POV buildpacks should focus on that task, creating the container image, and leave the other CI tasks to specialized tools.

Please help me to adjust my mental model, why's this added complexity in the buildpack spec worth it?

@jabrown85
Contributor

Please help me to adjust my mental model, why's this added complexity in the buildpack spec worth it?

I'm not the author but I can speak for my own mental model.

One use case is from the app developer side. Today they can pack build <img> to get a production artifact.

If we imagine the resulting image is ruby + nodejs, the developer has to set up both of those environments locally to develop and AGAIN on, say, GitHub Actions. The way those installations happen can differ. The GitHub action may install a different minor version of node or ruby to run the tests, for instance.

Setting up CI with buildpacks could make this experience easier. pack test would use the same buildpacks and therefore test a more prod-like experience as the buildpack would install the same versions. This is especially important if your buildpack has options like DO_THIS_GARBAGE_COLLECTION_SETTING that you would have to know and replicate in your CI environment.

Another future-proofing thing here may be around future ARM support. Building and running an ARM test image via pack test seems tractable.

# Unresolved Questions
[unresolved-questions]: #unresolved-questions

- "env" is overloaded as a word since we also use it for environment variables. Is there a better word here?
Member

How about "mode"?

Member

I like "context", but that's just as overloaded. How about "purpose = test" or "intent = test"?

Member

I think "env" was overloaded before we got here, and it's ok to use.

With the test OCI Image, a platform can execute the tests in the pipeline as they see fit. This means a bulk of the responsibilities are platform concerns:

- Set which environment to build for
- Decide which buildpacks to execute
Member

Isn't this a lifecycle concern? i.e., when processing a group within an order, should it skip buildpacks that do not declare exec-env matching the desired env?

Member Author

When talking to @jabrown85, I thought the Platform/Builder provide the order.toml to lifecycle?

Member Author

From WG, lifecycle will decide which buildpacks to run based on the execution environment being passed along by the platform.

- How to execute the tests
- What is the test result format, e.g. [TAP](https://en.wikipedia.org/wiki/Test_Anything_Protocol)?
- How to process the test results
- What to do with the results
Member

Is this ever something we would want to spec? Having a consistent way for buildpacks to e.g., dump test output could help ensure portability across platforms.

Member Author

I could see us wanting to do that, but I wasn't sure how much we wanted to impose standards there.

- Should the execution environments be an enum or flexible as a string?
- enums will help encourage standardization across buildpacks and platforms.
- strings can help account for use cases we haven't thought of yet.
- Should buildpacks be allowed to specify an allowlist of execution environments?
Member

@natalieparellano natalieparellano Feb 2, 2023

Are there any downsides to doing this? It would be more flexible and possibly avoid some duplication within orders (we could keep all as a special value)

Member Author

I think the only downside is what if a user wants to override in a way it wasn't intended? Is there a reason to lock stuff out?
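The allowlist idea, combined with the working-group note that lifecycle would pick buildpacks based on the execution environment, could look roughly like the sketch below. It assumes each group entry carries an optional `exec-env` list and that `all` stays a special value; the buildpack IDs and the helpers `supports` and `filter_group` are hypothetical, and this is not how lifecycle is implemented today.

```python
def supports(buildpack, exec_env):
    """True if the buildpack's allowlist admits exec_env.

    A missing allowlist is treated as ["all"], so existing buildpacks
    keep running in every execution environment.
    """
    allowed = buildpack.get("exec-env", ["all"])
    return "all" in allowed or exec_env in allowed


def filter_group(group, exec_env):
    """Return the IDs of the buildpacks that would run for exec_env."""
    return [bp["id"] for bp in group if supports(bp, exec_env)]


# Made-up order group: one entry can serve several environments,
# which avoids duplicating near-identical groups in the order.
group = [
    {"id": "example/ruby"},
    {"id": "example/rspec", "exec-env": ["test"]},
    {"id": "example/asset-precompile", "exec-env": ["production"]},
]

print(filter_group(group, "test"))
print(filter_group(group, "production"))
```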

## Development Environments
The specifics of creating development environments are out of scope of this RFC, but it's not hard to extrapolate how these kinds of changes can assist in creating Buildpacks for development environments.

# How it Works
Member

It would be nice to see the pack flow as well

Member

Should we add anything to the app image to designate it as being built for a particular environment? To avoid users accidentally deploying a test image in production...

I could see folks wanting to use the same tag when re-building a test image for production, in order to use previously cached dependencies.

Member Author

@natalieparellano that's a good question. Do you think cached dependencies should be shared b/t different execution environments?

Member Author

Is a label the best place to designate the execution environment?

Contributor

IMO, it would be great if I could pack build --exec-env test image_foo and get image_foo:test automatically. If I then pack build --exec-env production image_foo, I'd get image_foo:production. If I want to change that, I could supply a tag and then pack should just use the tag I set, like pack build --exec-env test image_foo:my-test.
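The tagging behavior wished for above can be sketched as a small function. Note that `--exec-env` is the flag proposed in this comment, not an existing `pack` option, and `default_image_ref` is a hypothetical helper; the tag detection is deliberately simple and only looks at the last path segment.

```python
def default_image_ref(image, exec_env):
    """Append exec_env as the tag unless the user already chose one.

    Only the last path segment is inspected, so a registry port such as
    localhost:5000 is not mistaken for a tag. A digest reference
    (name@sha256:...) is also left untouched.
    """
    last_segment = image.rsplit("/", 1)[-1]
    if ":" in last_segment or "@" in last_segment:
        return image
    return f"{image}:{exec_env}"


print(default_image_ref("image_foo", "test"))
print(default_image_ref("image_foo", "production"))
print(default_image_ref("image_foo:my-test", "test"))
```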

I would also, in a perfect world, like to see cache shared across different execution environments. It is extremely likely that I'm using the same versions of Java, Node, etc... in both test & production so it would be a negative if we can't cache that and only download the language runtime once.

Member Author

For sharing cache, is it a problem if I run a test, get dev dependencies, but my production build doesn't want them and prunes that cache? The next time I run a test I would need to install them again. Do you just want to share cache for bootstrapping for every build?

Contributor

I don't feel like that will be an issue for things like the JVM or Node.js runtime, but it sounds like it would be for other layers, like node modules or other things installed via package manager that would change between environments. That would certainly be annoying and not expected as a user.

Caching is per layer, what if the buildpack were to create different layers for cases like this? Maybe incorporate the exec env key into the layer name or something to make it unique? It wouldn't need to do this everywhere, but for things that are env dependent, it could have npm_modules-test and npm_modules-production layers.

If you ran a build for test and then a build for production, would that result in keeping both of those cached? or would it evict the test layer when you run the production build because the production image didn't use that layer?
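A minimal sketch of the layer-naming idea from this comment, assuming the buildpack itself decides which layers depend on the execution environment. The helper `layer_name` is hypothetical, and the sketch says nothing about the eviction question raised above, which remains open.

```python
def layer_name(base, exec_env, env_dependent):
    """Suffix the layer name with exec_env only for env-dependent layers.

    Shared layers (for example a language runtime) keep one name so they
    can be cached once; env-dependent layers get one name per environment.
    """
    return f"{base}-{exec_env}" if env_dependent else base


print(layer_name("npm_modules", "test", env_dependent=True))
print(layer_name("npm_modules", "production", env_dependent=True))
print(layer_name("nodejs", "test", env_dependent=False))
```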

@kanaksinghal

I have a use-case for this.
In our company we use a variety of languages and stacks, and I wanted to write a generic CI pipeline where buildpacks help us build the container image by detecting the stack. But to run unit tests, I then need to detect the stack again myself and run specific commands for nodejs/go/java etc.
In a typical CI pipeline we would run unit tests, run sonar quality checks, and build the container image.

@jama22

jama22 commented Feb 9, 2023

Really excited for this proposal, and I think there's a lot of interesting ways to make use of it.

In the context of a self-hosted PaaS
I've seen a common dynamic where the "operator" needs to establish fine grained controls over what the final containers look like. These configurations are also prone to change as the artifact itself moves into different environments (service connectors, certificates, environment variables, access to secrets managers, etc.)

On the developer side, most folks don't ever see production-like environments, so being able to debug their applications in production-like environments becomes incredibly difficult. In the past, I've seen some of these per-environment configurations being managed through GitOps and config files. This gives the user some control over managing the movement of the workload through the system, but even still they may not be able to build that container and interact with it directly

Moving the environment configuration opens up a lot of possibilities in this use case. I could imagine an operator encoding their per-environment configurations into the builder, and giving the developer the ability to simulate how their application will change in behavior across configurations.

For CI/CD
I think there have been some interesting ideas already discussed above and in the proposal. A common use case I can come up with is building a library with buildpacks, where you'll want to test on multiple OS-distro and architecture combinations (e.g. x86 vs win vs arm, ubuntu vs debian vs rhel). Those env configurations are commonly configured in CI itself or possibly externalized to some other config. Moving them back into the buildpack may provide for an improved UX.

Changing what "prod-like" means
I think there's a pretty reasonable use case here where operators may want to give developers the ability to change what "prod-like" means. For example, the GCP buildpacks have GOOGLE_DEVMODE that can be enabled for faster changes. I can see it being used as a developer-oriented configuration while maintaining guardrails in prod-like envs.


There were some flaws in this design. Though it's clean to separate production and test code paths, they end up sharing a lot of code. Many of the bash based Heroku buildpacks would just [call `bin/compile`](https://github.com/heroku/heroku-buildpack-nodejs/blob/main/bin/test-compile#L24) with different parameters/env vars.

## [GOOGLE_DEVMODE](https://cloud.google.com/docs/buildpacks/service-specific-configs#google_devmode)

Similar to devmode, I just ran across live reloading in the Paketo python buildpacks https://paketo.io/docs/howto/python/#using-bp_live_reload_enabled

hone and others added 2 commits April 12, 2023 13:37
Co-authored-by: Natalie Arellano <[email protected]>
Signed-off-by: Terence Lee <[email protected]>
Co-authored-by: Josh W Lewis <[email protected]>
Signed-off-by: Terence Lee <[email protected]>
@cz4rny

cz4rny commented Jul 11, 2023

Two questions:

  1. How would one extract artifacts from the test env? Running unit tests, or any other static-analysis tools produces outputs separate from the end OCI image. Currently pack performs the build in a tmp directory, which is cleaned out at the end. How would one access the results of all the checks done in the dev env?

  2. How would I ensure that the code produced by another pack build, for the production environment, is tested? Having the CNB_EXEC_ENV set could introduce a completely different set of layers and hence a different output. So I would test the source code and the binary produced from the test environment, but that doesn't mean I've tested the exact same binary, source code, and artifacts produced in the production env.

@jabrown85
Contributor

Two questions:

  1. How would one extract artifacts from the test env? Running unit tests, or any other static-analysis tools produces outputs separate from the end OCI image. Currently pack performs the build in a tmp directory, which is cleaned out at the end. How would one access the results of all the checks done in the dev env?

This RFC aims to only produce a test image artifact. A test platform would then execute the image and collect the results (TAP/json/etc).

  2. How would I ensure that the code produced by another pack build, for the production environment, is tested? Having the CNB_EXEC_ENV set could introduce a completely different set of layers and hence a different output. So I would test the source code and the binary produced from the test environment, but that doesn't mean I've tested the exact same binary, source code, and artifacts produced in the production env.

There would be no such guarantee. Production images vary quite a bit from stack to stack and there is no one size fits all solution for testing. For a practical example, a ruby or node app will often have test environment specific dependencies that should not make their way into the final production images. Another example is a simple go app on a scratch/tiny stack. The go tooling itself, go test, should not be packaged into the final production image.

Thinking out loud, a test pipeline could grab the SBOM/digests/shasum of the things it cares about and compare them to the production image that was built.
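The "thinking out loud" idea could be sketched as a comparison of digests collected from both images. How a pipeline would obtain those digests (SBOM, layer metadata, shasum) is left open here; the artifact names, the placeholder digest values, and the helper `mismatched` are all made up for illustration.

```python
def mismatched(test_digests, prod_digests, must_match):
    """Return the artifact names whose digests differ between the images.

    An artifact that is missing from either image counts as a mismatch.
    """
    return sorted(
        name
        for name in must_match
        if test_digests.get(name) is None
        or test_digests.get(name) != prod_digests.get(name)
    )


# Placeholder digests, not real values.
test_image = {"nodejs": "sha256:aaa", "app": "sha256:bbb", "dev-deps": "sha256:ccc"}
prod_image = {"nodejs": "sha256:aaa", "app": "sha256:ddd"}

# The pipeline only compares what it cares about; dev-deps is expected
# to exist only in the test image and is therefore not listed.
print(mismatched(test_image, prod_image, ["nodejs", "app"]))
```

A non-empty result would tell the pipeline that what was tested is not what is about to ship.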

@cz4rny

cz4rny commented Jul 13, 2023

Right, so right now the image produced is a self-executing app build, whether it's npm, or go or whatever.

The test image would not execute anything but would contain the app with, to keep to the npm example, all of the dev dependencies. Maybe it's even built. And the pipeline would use that image, run its tools on top of that (static analysis, test, etc.), grab the produced outputs, and publish them.

@jabrown85
Contributor

The way I understand this is the produced image would have test processes contributed by buildpacks.

A golang buildpack would create a layer that contains go testing tools and maybe even run a go build ./... during the build phase so you can fail early. The go buildpack could contribute a test process to launch.toml that runs go test ./.... That would be the default process launched when the image is executed in a testing pipeline. As you said, the pipeline could also choose to run static analysis as well as any other process on the image (maybe lint in this example) and capture the results.
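As a rough sketch of the process contribution described above, the snippet below renders a `[[processes]]` entry of the kind a Go buildpack might write to launch.toml. It assumes the array form of `command` used by newer Buildpack API versions and an `exec-env` key on the process as proposed in this RFC; the helper `render_process` is hypothetical.

```python
import json


def render_process(proc_type, command, default=False, exec_env=None):
    """Render one launch.toml process table as TOML text.

    json.dumps is used to quote strings and lists, since its output for
    plain strings and string lists is also valid TOML.
    """
    lines = [
        "[[processes]]",
        f"type = {json.dumps(proc_type)}",
        f"command = {json.dumps(command)}",
        f"default = {'true' if default else 'false'}",
    ]
    if exec_env:
        # Proposed by this RFC, not part of a released Buildpack API.
        lines.append(f"exec-env = {json.dumps(exec_env)}")
    return "\n".join(lines) + "\n"


output = render_process(
    "test", ["go", "test", "./..."], default=True, exec_env=["test"]
)
print(output)
```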

@keskad

keskad commented Oct 6, 2023

Hi,

I wanted to comment on the concept with a use case. I don't see the point in creating production images in environment X while testing in environment Y. That's not a consistent way of doing CD pipelines (current behavior) 🙂

I give buildpacks to teams; they set e.g. BP_NODE_VERSION 16.1 and it will execute with Node 16.1, while in tests it will still execute with a hardcoded Node 16.0 because I have to maintain additional images. Using buildpacks in this scenario loses its point for me 🙂 A more consistent way is to just use a Dockerfile with the base image used in tests, because it has a consistent Maven and Java version inside.

Due to the nature of buildpacks (preparing the environment by installing and setting up the required tooling), I think the test phase is even a necessary scenario for implementing a valid CD pipeline in order to get a reproducible environment every time. The tests need to run on something that is CLOSE TO PRODUCTION 🙂

So I see that build phase as even necessary to implement a valid CD pipeline, so I like this proposal.

The case is similar to the principle that we should not build an artifact twice when promoting to the next environment, e.g. dev -> prod, but just reuse the artifact. There we should use the same build parameters and the same tooling.

It's difficult to maintain e.g. Node 14, Node 15, Java 8, Java 11, and 15 other base images just for tests, while giving teams the possibility to set a Node or Java version in buildpacks.

In my opinion, a testing phase would bring these benefits:

  • not maintaining base images internally just for running tests (in bigger organizations it's a huge effort)
  • having a consistent way of setting up both test & prod tooling (versions and parameters - e.g. team sets Node v16.1, then it is working in both test & prod)
  • not having to remember the testing command for each project (buildpack test phase would know a testing command for each technology just as it knows how to run the build)

@hone hone force-pushed the execution-environments branch from 9d72744 to af6ee3f Compare January 3, 2025 19:21
* changed `exec-env` key to an array
* clarified how processes will work
* added an open question around reserved string values being namespaced
* added some usage clarifications

Signed-off-by: Terence Lee <[email protected]>
@hone hone force-pushed the execution-environments branch from af6ee3f to 992d873 Compare January 3, 2025 19:25
Signed-off-by: Terence Lee <[email protected]>
- enums will help encourage standardization across buildpacks and platforms.
- strings can help account for use cases we haven't thought of yet.
- Should buildpacks be allowed to specify an allowlist of execution environments?
- What changes are needed in the buildpack registry?
Contributor

Users will probably want a way to tell if a buildpack supports execution environments and if so, which ones. Maybe searching by that too. i.e. filter by buildpacks that support execution environments.

Member Author

@jkutner should we make changes to the registry index to account for this somehow? Add it into buildpack.toml?

## Setting the Execution Environment for Build + Buildpack
A platform will set the `CNB_EXEC_ENV` env var to the desired execution environment. Buildpacks can then read this env var to branch or switch on the logic needed for that execution environment.

In addition, Builder Authors, Buildpack Authors, and App Developers will be able to specify various options for a specific execution environment using the `exec-env` key.
Member

Suggested change
- In addition, Builder Authors, Buildpack Authors, and App Developers will be able to specify various options to a specific execution environment using the `exec-env` key.
+ In addition, Builder Authors, Buildpack Authors, and App Developers will be able to specify various options to a specific execution environment using the `exec-env` key in a `project.toml` file.

Member Author

There are now a few other files like builder.toml and launch.toml.


### Buildpack API

A buildpack author will be able to determine the execution environment their buildpack is expected to build for by reading the `CNB_EXEC_ENV` environment variable. If this value is not set, a Buildpack Author can assume it's set to `production`. This will be provided for both `bin/detect` and `bin/build`.
Member

Instead of defaulting to production, should we have a concept of all? Maybe this is similar to the exec-env having *?

Member Author

@hone hone Jan 9, 2025

@jkutner do we need a concept of "all" for an individual lifecycle execution? In my head I liked the simplicity of a buildpack execution knowing the single exec env it is targeting. I'm not sure how "all" would work in the current RFC. For instance, if I'm a buildpack author targeting the test execution environment, I would install test dependencies. If I'm building for production, I would likely prune those for that image. To make this work, would we then need exec-env configuration in layer.toml? How would lifecycle know what layers to export? If we're ok with a "fat" image that launcher would "construct" at runtime given a CNB_EXEC_ENV, we could make it work.

Member Author

Talking with @jabrown85, I think compiled languages like Go might want to remove the source tree from /workspace for production but not for test, which would further complicate things.
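For reference, the lookup described in the Buildpack API text under discussion (read `CNB_EXEC_ENV`, assume `production` when it is not set) can be sketched as follows. The helper name `current_exec_env` is hypothetical, and treating an empty value the same as an unset one is an assumption of this sketch, not something the RFC text states.

```python
import os


def current_exec_env(environ=None):
    """Return the execution environment a buildpack should build for.

    Falls back to "production" when CNB_EXEC_ENV is not set. An empty
    value is treated like an unset one (an assumption of this sketch).
    """
    environ = os.environ if environ is None else environ
    return environ.get("CNB_EXEC_ENV") or "production"


# The same lookup would serve both bin/detect and bin/build.
print(current_exec_env({}))
print(current_exec_env({"CNB_EXEC_ENV": "test"}))
```

Under this rule a single buildpack execution always sees exactly one execution environment, which matches the simplicity argued for in the thread above.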
