Use original commit version as output version after a merge #253

rliebz · 2019-04-10T20:49:06Z

Currently, if the merge parameter is set for out, the version ref used will be the result of the merge commit. I believe the version ref should instead be the original commit.

Consider my poorly drawn diagram of a commit history:

*  m - Merge commit of x and o
|\
* |  x - Commit we're racing against
| *  o - Good commit that we want
|/
* p - Parent commit

And my poorly drawn pipeline:

git-repo --> [ run-tests ] --> git-repo --> [ bump-version-and-tag ] --> git-repo-bumped --> [ deploy ]

In this case, I've made some changes in commit o as part of my pipeline, and I'd like to tag and push commit o. To prevent pipeline failures if some commit x ends up in my target branch before the pipeline is finished, I've set merge: true, which merges the latest target branch here, combining x and o into merge commit m:

git-resource/assets/out

Lines 177 to 192 in 9129dd9

    
           while true; do 
        
             echo "rebasing..." 
        
             git fetch push-target "refs/notes/*:refs/notes/*" 
        
             git pull --rebase=preserve push-target $branch 
        
             result="0" 
        
             push_with_result_check result 
        
             if [ "$result" = "0" ]; then 
        
               break 
        
             elif [ "$result" = "1" ]; then 
        
               exit 1 
        
             fi 
        
             echo "rebasing and trying again..." 
        
           done

And then ultimately uses the current commit, which would be commit m. This means the rest of my pipeline will now include commit x, despite it not being part of the relevant changeset:

git-resource/assets/out

Lines 199 to 202 in 9129dd9

    
           jq -n "{ 
        
             version: {ref: $(git rev-parse HEAD | jq -R .)}, 
        
             metadata: $(git_metadata) 
        
           }" >&3

Because merging creates a non-linear history where commit o will still be intact, I think commit o should be used as the version ref. This would mean future parts of the pipeline can operate on the original change set without picking up racing commits in the target branch, while still being able to output the change into the target branch without races.

Edit: If this is something you're interested in, I'd be happy to contribute the change.

The text was updated successfully, but these errors were encountered:

vito · 2019-05-14T20:27:32Z

🤔 I see your point but I have an alternative suggestion.

I think it should emit the merge commit, because there are times where that becomes the artifact you care about for the rest of the pipeline, and if it doesn't return that commit there's really no way to propagate it along. To me, conceptually the put has created a new commit, so it should return it (or, with concourse/rfcs#24, return all the commits that it pushed).

It seems like you have a use case for wanting to push + merge but then continue along with the original commit for the rest of the pipeline. And this isn't possible right now because the passed constraint for downstream jobs is selecting the merged version rather than the original version.

We're working on a new algorithm (concourse/concourse#3602) for v6.0, and along with this new algorithm we have new passed constraint semantics. In v6.0, the x in get: x, passed: [y] will refer to a named output from y, instead of just referring to a resource.

This means you could do something like this:

jobs:
- name: merger
  plan:
  - get: my-repo
  - put: my-repo-merged # give it a different name
    resource: my-repo   # ...but use the same resource
    params: {merge: true}

- name: downstream-job-using-original-commit
  plan:
  - get: my-repo
    passed: [merger]

- name: downstream-job-using-merged-commit
  plan:
  - get: my-repo-merged
    passed: [merger]

In the meantime, you could also simulate this sort of thing today by having a duplicate resource definition that you push your merge commits to:

resources:
- name: my-repo
  type: git
  source: &my-repo-source
    uri: ...
    private_key: ...

- name: my-repo-merged
  type: git
  source: *my-repo-source

jobs:
- name: merger
  plan:
  - get: my-repo
  - put: my-repo-merged
    params: {merge: true}

- name: downstream-job-using-original-commit
  plan:
  - get: my-repo
    passed: [merger]

- name: downstream-job-using-merged-commit
  plan:
  - get: my-repo-merged
    passed: [merger]

Does that solve your problem?

rliebz · 2019-05-15T15:07:43Z

Referring to named outputs that are not resources would solve the problem, probably in a more intuitive way overall, so 👍 for that in v6.0.

Unfortunately, I cannot get the desired behavior with the current system, because the changes that we're putting are coming from a task output rather than from the original resource:

jobs:
- name: pipeline
  plan:
  - get: my-repo
  - task: make-changes
    # outputs: [repo-changed]
    # ...
  - task: run-tests-and-stuff
    input_mapping: {repo: repo-changed}
    # ...
  - put: repo-merged
    params:
      repository: repo-changed

So in this case, there can be 3 different refs:

my-repo: The original commit we pulled
repo-changed: Whatever modifications we make during the pipeline
repo-merged: The merge commit, which may include our changes and other changes

I don't believe it is possible to refer to repo-changed in later jobs, which is the one we're interested in.

vito · 2019-05-15T15:20:21Z

FWIW, you also would not be able to do that in v6.0 - it won't be able to fetch task outputs, only things that were a get or a put - the only thing v6.0 enables is specifying the name of the get/put to fetch in situations where there are multiple occurrences of the same resource in one build plan.

Wouldn't you need to push repo-changed to a branch or something in order to propagate it along on its own?

rliebz · 2019-05-15T16:18:24Z

When a merge occurs, the ref of the pre-merge commit is also pushed as part of that, since a merge preserves the full history of commits. So that ref will be available on the remote git server, even though it won't be the most recent commit in that branch's history. That is to say, repo-merged must by definition contain repo-changed as part of its history. The same thing would not be possible with a rebase, though, since the original commit would be re-created with a new SHA.

The purpose of the merge for my use case is not to combine histories and leverage the combined history later on in the pipeline, but to ensure that the changes that were made as part of the pipeline make it to the upstream branch, even in the case of races.

Using the pre-merge commit lets a pipeline refer to the same version that we were interested in before pushing, because in the case of a merge commit, the version that ends up in the upstream branch is not the same version of the code that we started with.

So in an operation like this, where we pull down a repo and put it somewhere else:

jobs:
- name: get-and-put
  plan:
  - get: repo-one
  - put: repo-two
    params: 
      repository: repo-one
      merge: true

- name: get-merged
  plan:
  - get: repo-two
    passed: [get-and-put]

Getting repo-two would actually give us a merged combination of repo-one and repo-two's history. If this is all we do, we could just pull down repo-one instead, but if we wanted to do any work in between those two steps, there's no way to reference that intermediate step.

vito · 2019-05-15T18:01:37Z

Ah, got it, that makes sense. I can see why you would want to propagate a newly made commit along. Maybe this could just be accomplished via a param?

put: repo-two
params:
  repository: repo-one
  merge: true
  returning: (merged | unmerged)

This does seem like it would be useful for situations where the merged commit isn't really the interesting part, because - as you said - it would potentially include commits that have nothing to do with the commit under test. For example if you were to tag the vetted code with a version number or something you would probably want to tag the unmerged commit, not the merge, so it doesn't include code changes that aren't actually in the version.

rliebz · 2019-05-15T18:29:25Z

Having a param would solve for my use case and let us avoid making a backward incompatible change by preserving the original behavior when unspecified.

👍

Setting the `returning` param to `unmerged` allows using the original commit as the resource output when setting the `merge` param to `true`. Resolves concourse#253

Setting the `returning` param to `unmerged` allows using the original commit as the resource output when setting the `merge` param to `true`. Resolves concourse#253 Signed-off-by: Rob Liebowitz <[email protected]>

rliebz mentioned this issue May 21, 2019

Add "returning" option for merge output ref #262

Merged

vito closed this as completed in #262 May 29, 2019

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Use original commit version as output version after a merge #253

Use original commit version as output version after a merge #253

rliebz commented Apr 10, 2019 •

edited

Loading

vito commented May 14, 2019

rliebz commented May 15, 2019

vito commented May 15, 2019

rliebz commented May 15, 2019

vito commented May 15, 2019

rliebz commented May 15, 2019

Use original commit version as output version after a merge #253

Use original commit version as output version after a merge #253

Comments

rliebz commented Apr 10, 2019 • edited Loading

vito commented May 14, 2019

rliebz commented May 15, 2019

vito commented May 15, 2019

rliebz commented May 15, 2019

vito commented May 15, 2019

rliebz commented May 15, 2019

rliebz commented Apr 10, 2019 •

edited

Loading