Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Change Proposal] Enforcing structure inside changelog.yml files #140

Open
ycombinator opened this issue Feb 24, 2021 · 22 comments
Open

[Change Proposal] Enforcing structure inside changelog.yml files #140

ycombinator opened this issue Feb 24, 2021 · 22 comments
Labels
discuss Issue needs discussion

Comments

@ycombinator
Copy link
Contributor

ycombinator commented Feb 24, 2021

Problem statement

With #131, packages are now expected to have a changelog.yml file in their root folders. That PR introduced the structure (syntax) for package changelogs but there are two unanswered questions about the semantics of the changelog entries:

  1. in the package's changelog.yml file, where should unreleased changes be stored?

  2. what should be the relationship of the version specified in a package's manifest.yml file to any of the versions specified in the package's changelog.yml file?

Proposal

The structure of package's changelog.yml files is expected to be like so:

- version: x.y.z
  changes:
  - description: ...
    type: ...
    link: ...
  - description: ...
    type: ...
    link: ...
- version: p.q.r
  changes:
  - description: ...
    type: ...
    link: ...
  - description: ...
    type: ...
    link: ...
- version: a.b.c
  changes:
  - description: ...
    type: ...
    link: ...
  - description: ...
    type: ...
    link: ...
  • We introduce a special version, next, that is expected to be first entry in the top-level array in changelog.yml files (where x.y.z is in the snippet above). All the changes under this version will be unreleased ones.
  • We expect that the second entry in the top-level array in changelog.yml files (where p.q.r is in the snippet above) matches exactly the value of version in the package's manifest.yml file.

This allows a package developer to accumulate unreleased changes in the changelog under the first entry (next). When the package developer is ready to release a new version of the package, they do the following, all as part of a single PR:

  1. bump up the value of version in their package's manifest.yml,
  2. replace next with the new value of version in their package's manifest.yml, and
  3. add a new next entry at the top of their package's changelog.yml file with an empty list of changes under it.
@ycombinator ycombinator added the discuss Issue needs discussion label Feb 24, 2021
@ycombinator
Copy link
Contributor Author

@mtojek @ruflin WDYT?

Anyone else we should pull into this discussion?

@mtojek
Copy link
Contributor

mtojek commented Feb 24, 2021

I think I talked with @masci about this. Massi suggested to leverage from semver and introduce here th "prerelease" tag. We can later tweak our scripts to check the relation between:

  • version in manifest and prerelease in changelog.yml
  • version in manifest and version in changelog.yml
  1. bump up the value of version in their package's manifest.yml,
  2. replace next with the new value of version in their package's manifest.yml, and
  3. add a new next entry at the top of their package's changelog.yml file with an empty list of changes under it.

I think we follow the same path with package-registry. I'm not sure if the step 3. is mandatory, but maybe it's better to enforce it.

@ycombinator
Copy link
Contributor Author

I think I talked with @masci about this. Massi suggested to leverage from semver and introduce here th "prerelease" tag.

I'm good with this idea but would like understand what it looks like in practice.

Could you come up with a snippet that shows a package's changelog.yml file using a prerelease tag? Assume that the package's latest released version is 2.4.1, so that is also the value of the version property in the package's manifest.yml file.

We can later tweak our scripts to check the relation between ...

Which scripts are you thinking of implementing these checks in? I was wondering if this is a good fit for #47. But if there's an easier/quicker way to implement this, I'm open to it.

@mtojek
Copy link
Contributor

mtojek commented Mar 8, 2021

Let me think loud -

The package is marked with 2.4.1-prerelase and there are few entries in the changelog.

The release operation (used with elastic-package publish patch/minor/major?) needs to be atomic and requires two commits:

  • release: 2.4.1 in manifest.yml#version and updated changelog
  • next prerelease: 2.4.2-prerelease in manifest.yml#version and recycled changelog

@ycombinator
Copy link
Contributor Author

ycombinator commented Mar 8, 2021

@mtojek Got it, thanks for the explanation. I'm ++ for implementing elastic-package release that performs the steps you outlined.

I do think we also need validation of CHANGELOG files for this part:

We expect that the second entry in the top-level array in changelog.yml files (where p.q.r is in the snippet above) matches exactly the value of version in the package's manifest.yml file.

... as nothing is requiring a developer to use elastic-package release to maintain theirchangelog.yml and manifest.yml#version. IOW, there might be teams/developers who are maintaining these files manually.

@ycombinator
Copy link
Contributor Author

The package is marked with 2.4.1-prerelase and there are few entries in the changelog.

This will also require that the integrations CI does not automatically try to publish packages with *-prerelease versions to package-storage#snapshot.

rw-access pushed a commit to rw-access/package-spec that referenced this issue Mar 23, 2021
* Update dependency on package-spec

* Fix: make check
@ycombinator
Copy link
Contributor Author

Related: #167.

@ycombinator
Copy link
Contributor Author

This issue is partially resolved via #172.

@mtojek the part that is left unresolved (for me) is whether we should have additional validation about the first version entry inside changelog.yml files. Should this entry be special (like next or something) to provide a space for package authors to accumulate changes for the next release of the package?

@mtojek
Copy link
Contributor

mtojek commented May 11, 2021

I found out that the comment # unreleased might be also misleading: https://github.com/elastic/integrations/pull/974/files#r629379784

I think the only cure for this is strict validation (including next).

@endorama
Copy link
Member

Hello, there have been some work in creating a dedicated tool to manage the changelog: https://github.com/elastic/elastic-agent-changelog-tool

I've been working on it for some time, and it could help resolve the mentioned issues, with the advantage of fitting better in the new semver based release model we are going to adopt. It has not been developed to target integrations (mainly Agent/Beats) but I think it's worth to discuss if we want to reuse it here too.

The general idea is to create a fragment (a file with some info on the introduced change) for each PR and collect them in a changelog at release, so the version a certain entry is part of is not determined upfront but results from releasing a certain change.
You may want to have a look at https://github.com/elastic/elastic-agent-changelog-tool/blob/main/docs/concepts.md and https://github.com/elastic/elastic-agent-changelog-tool/blob/main/docs/getting-started.md to get an idea on how it works.

@mtojek
Copy link
Contributor

mtojek commented Jul 28, 2022

The general idea is to create a fragment (a file with some info on the introduced change) for each PR and collect them in a changelog at release, so the version a certain entry is part of is not determined upfront but results from releasing a certain change.

What is the difference between fragments and the changelog.yml file?

@endorama
Copy link
Member

endorama commented Aug 1, 2022

Instead of having a single changelog, you have multiple independent fragments. This prevents the multiple consequent conflicts that happen when there is high activity on a package.

For example this is what recently happened to gcp package, where we are seeing a lot of activity from different devs (Security External integration and Cloud Monitoring team and an external contributor).

This is the snapshot of currently opened PRs:
Screenshot from 2022-08-01 15-14-29

Each of this PRs conflict with each other in 2 things:

  • changelog.yml
  • manifest.yml, field version

By using fragments this would not happen because (see this example):

  • each PR has it's own piece of changelog (the fragment), in a distinct unique file
  • the version is not updated at PR time but at release time

By postponing building the changelog and assigning the version at release time is possible to work in parallel without introducing conflicts only in the changelog declarations.

@mtojek
Copy link
Contributor

mtojek commented Aug 1, 2022

It's my personal view: I find editing the changelog inconvenient for Beats as for Integrations, but I don't see an issue in modifying a single changelog file by multiple contributors. It's a matter of skills to rebase and merge PRs.

It would be awesome if we don't need to update the changelog or fragments at all. For example, in the elastic-package project, we're using GoReleaser to generate it from commit messages. Personally, I find it really convenient as I don't have to care about it at all :)

Any thoughts about this @elastic/ecosystem?

@endorama
Copy link
Member

endorama commented Aug 1, 2022

It's a matter of skills to rebase and merge PRs.

Not really. Is not the technical difficulty that is a blocker here. It's a matter of each PR someone merge creates conflicts in others that target the same package. So each time someone has to rebase, wait for tests to pass again and merge. If these someone do not sync, multiple people can rebase their PR, trigger test runs and see their PR fail again with the same conflict, because someone else has been faster in merging their PR. With multiple contributions this issue delays merging, and adds up to the risk of having stale PRs (especially for external contributors).

It would be awesome if we don't need to update the changelog or fragments at all.

Indeed, but:

  • not every project uses go releaser or a standard release mechanism (i.e. integrations)
  • go release or other automated release mechanism are not particularly flexible (regexp matching means enforcing a common commit message format, that does not really play well with squash/merge pattern)

The fragment proposal is similar to how other OSS projects with high activity handle this (i.e. Ansible, Elasticsearch)

@mtojek
Copy link
Contributor

mtojek commented Aug 1, 2022

With multiple contributions this issue delays merging, and adds up to the risk of having stale PRs (especially for external contributors).

This is a valid concern and something we should tackle eventually. It's also about money burned on CI runs. I'm wondering if fragments can be an intermittent form of changelog, which is transformed into a built form (single changelog.yml) in the package building process. Of course, CI should acknowledge that this is a dynamic file.

The fragment proposal is similar to how other OSS projects with high activity handle this (i.e. Ansible, Elasticsearch)

I'd love to read about those. Do you have any write-ups/links describing those approaches?

@endorama
Copy link
Member

endorama commented Aug 1, 2022

I'm wondering if fragments can be an intermittent form of changelog, which is transformed into a built form (single changelog.yml) in the package building process. Of course, CI should acknowledge that this is a dynamic file.

Indeed! There is the concept of a "consolidated changelog", where all fragments are gathered together in a form similar to the current changelog.

@jsoriano
Copy link
Member

jsoriano commented Aug 1, 2022

One thing I would like to mention is that current changelog format is only expected to be like that for the final artifact. Each development team could decide how to build it. For example during development they could keep a dummy changelog, and build the final changelog to be included in the final package with whatever tool they want.
I think it can make sense to keep this approach, with a simple changelog format in the spec, but allowing developers to use the procedure they prefer to build it.

@endorama could https://github.com/elastic/elastic-agent-changelog-tool generate changelogs in the format of the package spec? Or we would need to modify the spec to accept the format generated by this tool?

We would still need to think though on how to solve pain points in the integrations monorepo. It may make sense to introduce the use of fragments there.

Also, to facilitate this kind of approaches, we would need to be less strict on what can be included in _dev directories, so developers can include fragments there.

Each of this PRs conflict with each other in 2 things:
* changelog.yml
* manifest.yml, field version

Do we need to modify the version on every change? It should be possible to use a prerelease suffix to combine multiple changes. If this is not possible now we may need to iterate on the experience on this.

It would be awesome if we don't need to update the changelog or fragments at all. For example, in the elastic-package project, we're using GoReleaser to generate it from commit messages. Personally, I find it really convenient as I don't have to care about it at all :)

Something like GoReleaser may not work so well for a monorepo like integrations, some layer would be needed to detect what changes belong to what integration, this may be complex. Also, GoReleaser creates a changelog entry for every change. I find this okish for developer tools, but for more user-facing I think it is better to decide what to include. So some manual curation is probably always going to be needed in packages.

@endorama
Copy link
Member

endorama commented Aug 3, 2022

could elastic/elastic-agent-changelog-tool generate changelogs in the format of the package spec? Or we would need to modify the spec to accept the format generated by this tool?

The tool has the concept of rendering a consolidated changelog. The current format we are focusing on is Asciidoc to be included in release notes, but is possible to support any format by adding the dedicated renderer.

Do we need to modify the version on every change?

No we do not, and we could improve the process around releasing packages, better managing and syncing multiple teams working on the same package. But I think there is greater leverage in addressing the issue without adding another process (which may also be very complex to manage for external contributors).

It should be possible to use a prerelease suffix to combine multiple changes.

I've yet to try this (as I'm still a bit confused around version changes and migration to package storage v2), but this would not really solve the conflict issue and would need a stricter release process for packages (complex with multiple data stream granularity ownership).

I think using fragments will help in flexibility and will remove a current source of rework in our PRs, while also helping in structuring our release process.

@endorama
Copy link
Member

Today discussing with @kaiyan-sheng and @gsantoro an use case emerged that the current changelog structure makes very hard: doing small related PRs incurs in multiple conflicts in the changelog file, which encourage a big PR approach and slows down iterating on integrations.

@endorama
Copy link
Member

endorama commented Aug 17, 2022

I need to iterate on this comment:

Do we need to modify the version on every change?

No we do not, and we could improve the process around releasing packages, better managing and syncing multiple teams working on the same package. But I think there is greater leverage in addressing the issue without adding another process (which may also be very complex to manage for external contributors).

As per development guidelines the package version must be updated every time, otherwise the package is not added to the snapshot storage, creating issues for package promotion. I can confirm this is still the case as I added multiple PRs to gcp 2.5.0 and the package has not been updated in package storage. Compare changelogs: [email protected] in integrations vs [email protected] in package storage.

@mtojek
Copy link
Contributor

mtojek commented Aug 22, 2022

I can confirm this is still the case as I added multiple PRs to gcp 2.5.0 and the package has not been updated in package storage. Compare changelogs: [email protected] in integrations vs [email protected] in package storage.

You need to follow a different workflow.

  1. Add a new version entry to changelog, for example: 2.6.0-next.
  2. Add PRs to the 2.6.0-next.
  3. Once EVERYTHING is merged, you need to raise another PR to bump up the manifest version to 2.6.0 and replace 2.6.0-next with 2.6.0.

I wanted to highlight that I don't undermine your complaint regarding version handling, rather the opposite, as I'm total with you that it isn't straightforward and concerning. I guess that we can rediscuss it and review your proposal on fragments.

@endorama
Copy link
Member

@endorama could elastic/elastic-agent-changelog-tool generate changelogs in the format of the package spec? Or we would need to modify the spec to accept the format generated by this tool?

Just adding that in tool v0.4.0 (still unreleased at the moment) we added support for custom templates when rendering a consolidated changelog. For templating text/template go package is used, so this use case is supported already by providing a template that renders to the package spec changelog template.
(in the long run we may want to embed a custom template to do this in the tool)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
discuss Issue needs discussion
Projects
None yet
Development

No branches or pull requests

4 participants