Option to generate run_exports.json file #102

Closed
2 tasks done
Tracked by #5
jaimergp opened this issue Apr 14, 2023 · 9 comments
Labels
locked ([bot] locked due to inactivity), type::feature (request for a new feature or capability)

Comments

@jaimergp
Contributor

Checklist

  • I added a descriptive title
  • I searched open requests and couldn't find a duplicate

What is the idea?

In this issue I am asking for a CLI flag / option to generate a run_exports.json file, next to repodata.json and friends, much like channeldata.json is handled.

Once available, this flag could be enabled on Anaconda.org (at least for some channels).

Why is this needed?

Build infrastructure (such as some conda-forge bots) uses run_exports to calculate which packages need rebuilding as part of an upgrade. Right now, conda-forge needs to maintain its own JSON database, which involves downloading and extracting new artifacts as they become available.

What should happen?

When invoked with the new flag (e.g. --run-exports), conda-index would also generate a run_exports.json file, which would be placed next to repodata.json, containing the run_exports metadata from each artifact:

{
  "info": {
    "subdir": "osx-64"
  },
  "packages": {
    ...
    "assimp-5.2.5-h276577b_0.tar.bz2": {
      "run_exports": {
        "weak": ["assimp >=5.2.5,<5.2.6.0a0"],
        "strong": []
      }
    },
    ...
  }
}
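As an illustration of how a consumer might read such a file, here is a minimal sketch in Python. The helper name `get_run_exports` is hypothetical, and the sample document only mirrors the shape proposed above; it is not the output of any existing tool.

```python
import json

# Sample document mirroring the proposed shape (a single artifact only).
SAMPLE = """
{
  "info": {"subdir": "osx-64"},
  "packages": {
    "assimp-5.2.5-h276577b_0.tar.bz2": {
      "run_exports": {
        "weak": ["assimp >=5.2.5,<5.2.6.0a0"],
        "strong": []
      }
    }
  }
}
"""


def get_run_exports(doc, filename, kind="weak"):
    """Return the run_exports list of the given kind for one artifact.

    Hypothetical helper: unknown artifacts, missing "run_exports" keys
    and missing kinds all yield an empty list.
    """
    entry = doc.get("packages", {}).get(filename, {})
    return list(entry.get("run_exports", {}).get(kind, []))


doc = json.loads(SAMPLE)
print(get_run_exports(doc, "assimp-5.2.5-h276577b_0.tar.bz2"))
```

Returning an empty list for unknown artifacts keeps the lookup usable whether or not unpopulated entries end up being written to the file.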

To be discussed: run_exports is not a very common field (most packages do not have it), so it might make sense to only generate entries for those artifacts whose run_exports are populated.

Additional Context

channeldata.json already offers some run_exports metadata. However, this is insufficient because for some reason it is flattened on a per-version basis, which doesn't cover build updates.
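A small sketch of why per-version flattening loses information. The records below are hypothetical: the second build and its pin are invented for illustration, and the "last build wins" behaviour is an assumption, not a description of how channeldata.json is actually assembled.

```python
# Hypothetical per-artifact records: (name, version, build, weak run_exports).
# The second build and its pin are made up for this example.
artifacts = [
    ("assimp", "5.2.5", "h276577b_0", ["assimp >=5.2.5,<5.2.6.0a0"]),
    ("assimp", "5.2.5", "h276577b_1", ["assimp >=5.2.5,<5.3.0a0"]),
]


def flatten_per_version(records):
    """Key by (name, version); a later build overwrites an earlier one (assumed)."""
    out = {}
    for name, version, _build, weak in records:
        out[(name, version)] = weak
    return out


def keep_per_artifact(records):
    """Key by full artifact name, so every build keeps its own entry."""
    out = {}
    for name, version, build, weak in records:
        out[f"{name}-{version}-{build}"] = weak
    return out


print(len(flatten_per_version(artifacts)), "entry when keyed per version")
print(len(keep_per_artifact(artifacts)), "entries when keyed per artifact")
```

With per-version keys, a build that changes its run_exports without a version bump cannot be told apart from the earlier build.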

Having a separate file for run_exports would also allow conda-forge to patch it like it is done with repodata.json, saving the hassle (and CI resources) of rebuilding a package just to amend some metadata.

Note: adding this field to repodata.json is also an option, but conda clients do not really use this at install time. It would only be used by conda-build and the surrounding infrastructure.

@jaimergp
Contributor Author

Gentle ping @conda/conda-index and @conda/conda-core :)

@chenghlee

chenghlee commented Apr 25, 2023

I've added this as a topic of discussion for this week's community meeting. I like the idea of providing per-artifact run_exports, but my brain is still too deep in post-PyCon mush mode to properly think through the pros & cons of having a new file vs. an extension in repodata.json.

@jaimergp
Contributor Author

Thanks @chenghlee! Let me start a list we can iterate on:

repodata.json:

  • ✅ Already existing file
  • ✅ Fewer changes in conda-index, CDN infra?
  • ❌ Bigger repodata files for data not used by conda clients (although this is minimized by JLAP)
  • ❌ Might require updates in tools that load repodatas with schemas

run_exports.json:

  • ✅ Better separation of concerns
  • ✅ conda clients do not need to worry about an extra file
  • ❌ Might require more effort in conda-index, CDN infra and other projects

@dholth
Contributor

dholth commented Apr 25, 2023 via email

@jaimergp
Contributor Author

We have not been able to hotfix or repodata-patch this field in the past, and it has been a limitation in conda-forge too. Enabling that would allow us to skip some unnecessary rebuilding (sometimes we have to rebuild just to patch run_exports).

@jaimergp
Contributor Author

CEP-12 is now a thing and the community approved this feature.

The gist is:

  • run_exports.json will be served next to repodata.json, i.e. one file per subdir.
  • All files need to be listed, even if no run_exports data is available for them.
  • This file won't be patched for now.
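To make the "all files listed" rule concrete, here is a rough sketch of assembling the document for one subdir. The function name `build_run_exports_doc` and its inputs are hypothetical, and representing an artifact without run_exports as an empty mapping is an assumption; this is not how conda-index implements it.

```python
import json


def build_run_exports_doc(subdir, filenames, extracted):
    """Assemble a run_exports.json-like dict for one subdir.

    subdir:    e.g. "osx-64"
    filenames: every artifact indexed for that subdir
    extracted: {filename: run_exports dict} for artifacts that define some

    Every filename gets an entry; artifacts without run_exports get an
    empty mapping (assumed shape).
    """
    packages = {}
    for fn in sorted(filenames):
        packages[fn] = {"run_exports": extracted.get(fn, {})}
    return {"info": {"subdir": subdir}, "packages": packages}


doc = build_run_exports_doc(
    "osx-64",
    ["zlib-1.2.13-0.tar.bz2", "assimp-5.2.5-h276577b_0.tar.bz2"],
    {
        "assimp-5.2.5-h276577b_0.tar.bz2": {
            "weak": ["assimp >=5.2.5,<5.2.6.0a0"],
            "strong": [],
        }
    },
)
print(json.dumps(doc, indent=2))
```

The zlib file name is made up for the example; it only stands in for an artifact that carries no run_exports metadata.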

I'm happy to start a PR targeting this, but let me know how you'd like to proceed or if you have any pointers, @dholth! Thanks!

@dholth
Contributor

dholth commented Aug 14, 2023 via email

@jaimergp
Contributor Author

To ensure that all files indexed are indeed contained in the run_exports.json. Otherwise we could run into questions like "is this file out of date or is there a bug in the processing pipeline?". This also allows us to use it without having to check repodata.json for the full list.
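A quick sketch of the kind of check this makes possible: because every indexed file must be listed, any file that is in repodata.json but absent from run_exports.json points at a stale file or a pipeline bug. The function name is hypothetical, and treating "packages.conda" as a key of run_exports.json (mirroring repodata.json) is an assumption here.

```python
def missing_from_run_exports(repodata, run_exports):
    """Return indexed file names that have no entry in run_exports.

    Both arguments are already-parsed JSON documents (dicts). The
    "packages.conda" key on the run_exports side is assumed to mirror
    repodata.json.
    """
    keys = ("packages", "packages.conda")
    indexed = set()
    listed = set()
    for key in keys:
        indexed.update(repodata.get(key, {}))
        listed.update(run_exports.get(key, {}))
    return sorted(indexed - listed)


# Toy documents; the file names are made up for the example.
repodata = {
    "packages": {"a-1.0-0.tar.bz2": {}, "b-2.0-0.tar.bz2": {}},
    "packages.conda": {"c-3.0-0.conda": {}},
}
run_exports = {
    "packages": {"a-1.0-0.tar.bz2": {"run_exports": {}}},
    "packages.conda": {"c-3.0-0.conda": {"run_exports": {}}},
}
print(missing_from_run_exports(repodata, run_exports))
```

An empty result means the two files agree on the set of artifacts, so run_exports.json can be used on its own without consulting repodata.json.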

@jaimergp
Contributor Author

jaimergp commented Sep 5, 2023

Closed by #110

@jaimergp jaimergp closed this as completed Sep 5, 2023
@github-project-automation github-project-automation bot moved this from 🆕 New to 🏁 Done in 🧭 Planning Sep 5, 2023
@github-actions github-actions bot added the locked [bot] locked due to inactivity label Aug 27, 2024
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Aug 27, 2024