Monorepo / Monobuild support? #936
Comments
[I'm just somebody driving by] I've had to deal with the monorepo (anti)pattern 1️⃣ and I agree it'd be very helpful to have help in dealing with it and packaging it; but I have also learned it's really hard to do good things for monorepos without making compromises to normal patterns. To me it sounds like this could be hard to get into Poetry, at least right from the start. It's easier to imagine another library where someone chooses an option like buck/pants/bazel and then creates a helper to make that work well with Poetry and vice versa. 1️⃣: It's not always an antipattern, I know. Too often it is, and many best practices are abandoned. That can make it hard to develop monorepo-related features without specific good examples that are targeted for support. TL;DR: it could be good to link to an OSS example (or contrive one and link to that) |
I understand. I hate the religious view taken towards monorepos. I have found what I call mini-monorepos to be useful: small repos that serve a single purpose. For example, in the Rust world, I have a package for generic testing of conditions, called I specifically suggested continuing to follow Cargo's model, like poetry has done in other ways, for monobuild support rather than getting into the more complex requirements related to tools like buck/pants/bazel (fast as possible, vendor-all-the-deps, etc). If someone needs the requirements of those tools, they should probably just use those tools instead. From at least my brief look, it seemed like they don't do a good job of interoperating with other Python build / dependency management systems. |
Also would love support here! Specifically for developing a library that has different components each with their own set of potentially bulky dependencies (e.g. core, server type 1, server type 2, client, etc.). Also helpful when trying to expose the same interface that supports different backends (similarly, you don't want to install every backend, just the one you want). The only OSS library I could find that emulates this approach is toga -- it would be great if poetry could handle dependency resolution for these sorts of libraries. The toga quickstart explains how the dependencies are managed with a bunch of |
I'm interested in this also. How far does the current implementation of editable installs get you towards your use case? |
Use cases
Rust can solve this at two levels
So regarding the first, editable installs might cover this if you can mix path dependencies with version dependencies which is the key feature needed to partially handle my specified use cases. |
I tried some of this out today and it looks like the first feature you describe is nearly, but not quite supported. You can declare a dependency both as a dev and non-dev dependency, when you do this the dev version should take precedence (at least that's how I interpret this), allowing a package to be installed as editable during local dev. Then for final build the I've made it work with a toy example which contains a path dependency in dev and non-dev mode but not for a local vs pypi dependency. |
I am currently trying to find a way to build a Python monorepo without using heavy-duty tools like Bazel. I have seen a repository which uses https://github.com/pymedphys/pymedphys Using yarn you can declare workspaces, and using lerna you can execute commands in all workspaces. You can then use the I've yet to try this technique with Poetry (and the pymedphys repository does not use it), however I feel it might be worth exploring. |
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. |
I still think this could be useful |
Something I am running into right now is that using Poetry with path dependencies is very unfriendly to the Docker cache. To correctly leverage the cache (read: don't install all dependencies from scratch on every build) I would want to install the dependencies first, and then copy my own code over and install it. However Poetry refuses to do anything (even I am considering writing my own parser for the One of two things could help me:
|
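A minimal sketch of the "install third-party dependencies first, copy local code later" idea from the comment above. It is not a Poetry feature: `split_dependencies` is a hypothetical helper, and it assumes the `[tool.poetry.dependencies]` table has already been parsed into a plain dict (for example with a TOML parser). It only separates path dependencies from everything else, so that the external part could be installed in an early, cacheable Docker layer.

```python
def _local_path(spec):
    """Return the path of a path dependency, or None for any other spec."""
    # A dependency may be a string, a table, or a list of tables
    # (multiple constraints); only tables can carry a "path" key.
    candidates = spec if isinstance(spec, list) else [spec]
    for item in candidates:
        if isinstance(item, dict) and "path" in item:
            return item["path"]
    return None


def split_dependencies(dependencies):
    """Split a parsed dependencies table into (external, local).

    `external` maps names to their original constraint;
    `local` maps names to the relative path of the in-repo package.
    """
    external, local = {}, {}
    for name, spec in dependencies.items():
        if name == "python":
            continue  # the interpreter constraint is not an installable package
        path = _local_path(spec)
        if path is None:
            external[name] = spec
        else:
            local[name] = path
    return external, local


# Hypothetical example table, written as the dict a TOML parser would produce.
example = {
    "python": "^3.10",
    "requests": "^2.28",
    "core": {"path": "../core", "develop": True},
}
external, local = split_dependencies(example)
```

With this split, a build could install `external` before any source code is copied into the image, and only then copy and install the packages listed in `local`.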
Any progress on this feature request? I've been using poetry for about a year now inside a monorepo in a research lab. We have a single global pyproject.toml that dictates the global set of dependencies, since there isn't a way to have multiple separate pyproject.toml files (like cargo workspaces). Needless to say, this creates a lot of pain, because we continually experience problems when one developer wants a dependency that is incompatible with another developer's code. As a result, most of the researchers go off the grid and don't use poetry at all, instead having their own venvs that only work on their machines. |
Related: #2270 @TheButlah this is something I wish to pick up sometime this year. Would be great to make sure the use case is listed in the issue. |
I've quickly made a prototype for handling monorepos, with success. In fact it nearly works out of the box already! Are you interested if I make a PR? I put here some unordered details and thoughts. I don't guarantee it'd resolve all cases, but at least it's fine for my needs. I'm basing this workflow on the one I use for JS/TS projects using yarn and lerna. Repo structure:
The idea is to have a private/virtual pyproject.toml at the root. Each package has its own pyproject.toml specifying the package name, version, dependencies, as usual. I could have created one virtualenv per package, but I find that too cumbersome (I'm using VS Code and it would be very annoying to switch virtualenvs every time I change the file I'm working on), so I created a global virtualenv and that's the purpose of the root pyproject.toml. This file contains all the project packages as sole dependencies (using the format There is often the need for executing a command in every package directory. For instance, and this is already working, running On the CICD server, it makes sense to create one virtualenv per package, at least for one thing: checking that each package only imports other packages that are declared as its dependencies. In that sense, it may make sense for packages to have their own dev dependencies, e.g. if they have some specific tests which are not shared with others. I'm used to delegating version bumping to the CICD pipeline: every merged PR will automatically bump versions and publish packages, based on what has been changed (using the Conventional Commits spec). Dependent packages need to be bumped as well: if bar depends on foo, and bar hasn't changed but foo has some changes of any kind, bar still needs to get a patch bump. Actually this is only true if bar depends on the current version of foo: if an earlier version is specified, the dependency constraint is not updated. Lerna takes care of For publishing, lerna can retrieve the package versions on the registries, and publishes only the new ones. Here in Python this is tedious, since PyPI (warehouse) and pypiserver do not have a common route to get version info. My hack for now is just to publish everything, and I ignore Caveats: packages must not declare conflicting dependencies if there's only one global virtualenv. 
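The bump rule described above (bar depends on foo, foo changed, so bar gets a patch bump) can be sketched as a small graph walk. This is only an illustration, not Lerna's or Poetry's implementation: the function names are hypothetical, versions are assumed to be plain `MAJOR.MINOR.PATCH` strings, every dependent is assumed to track the current version of its dependency, and the propagation is assumed to be transitive.

```python
from collections import deque


def bump_patch(version):
    """Increment the patch component of a MAJOR.MINOR.PATCH version string."""
    major, minor, patch = (int(part) for part in version.split("."))
    return f"{major}.{minor}.{patch + 1}"


def dependents_needing_patch_bump(changed, depends_on):
    """Return unchanged packages that must get a patch bump.

    `changed` is the set of packages with their own changes.
    `depends_on` maps each package to the set of in-repo packages it uses.
    """
    # Invert the graph: for each package, who depends on it?
    dependents = {}
    for package, dependencies in depends_on.items():
        for dependency in dependencies:
            dependents.setdefault(dependency, set()).add(package)

    # Breadth-first walk from the changed packages through their dependents.
    seen = set(changed)
    queue = deque(changed)
    while queue:
        current = queue.popleft()
        for dependent in dependents.get(current, ()):
            if dependent not in seen:
                seen.add(dependent)
                queue.append(dependent)
    return seen - set(changed)


# Hypothetical repository: bar uses foo, baz is independent.
graph = {"foo": set(), "bar": {"foo"}, "baz": set()}
to_patch = dependents_needing_patch_bump({"foo"}, graph)
```

Here `to_patch` contains only `bar`; `foo` itself would get whatever bump its own commits call for, and `baz` is left alone.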
Proposition
Again, what I'm proposing here addresses my needs, but this feature should fit other use cases as well, so please give feedback about what you would need for your own workflow. I prefer not to use "monorepo" in the names, which is too coercive. New commands:
New [tool.poetry.packages]
paths = ["packages/*"] # used to specify where to find packages
version = "independent" # optional flag to let each package have their own version. Without this flag, all packages get the same version. |
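To make the proposed `paths` setting concrete, here is a rough sketch of how a tool might expand it. The `[tool.poetry.packages]` table is only a proposal from this comment, not an existing Poetry option, and `discover_packages` is a hypothetical helper. It assumes each glob pattern is relative to the repository root and that a directory counts as a package only if it contains its own pyproject.toml.

```python
from pathlib import Path


def discover_packages(root, patterns):
    """Return package directories under `root` matching the glob patterns."""
    root = Path(root)
    found = []
    for pattern in patterns:
        # Sort so the result does not depend on filesystem ordering.
        for candidate in sorted(root.glob(pattern)):
            if (candidate / "pyproject.toml").is_file() and candidate not in found:
                found.append(candidate)
    return found
```

A root-level command could then loop over `discover_packages(".", ["packages/*"])` and run the same operation in each directory, which is the "execute a command in every package directory" workflow described earlier in the comment.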
Yes indeed Edit: Some hints in pymedphys/pymedphys#192 (comment). Have requested the author to shed some more light though. Edit 2: Thanks to SimonBiggs. pymedphys/pymedphys#192 (comment) |
@KoltesDigital I would really like to try out the modifications you have made to allow poetry to manage the dependencies for a monorepo. Might you be able to share a branch to see these changes? You might have already shared but I just could not find it. Thanks! |
@hpgmiskin thanks for your interest! My prototype is more of a workflow PoC: I actually haven't changed poetry yet, instead I just added some scripts on top of it. These scripts are in JS/TS, and of course the final solution should only use Python. Moreover, I haven't implemented my whole proposition, and to do so I'll have to modify poetry. So before doing this work, I want to first have some feedback from contributors, in order to know whether they're ok with the direction I'm heading in. |
What about having Poetry correctly update its lock file when a sub-pyproject.toml file changes, without having to run Those commands are nice, but I'm unlikely to use them, and I'd rather see the monorepo use-case be properly supported as a first step, and the opinionated utility commands (and CI integration) added as a second step. |
@remram44 that alone would be a big step forward |
@remram44 it's actually what happens with the root-level pyproject.toml: it creates a single virtual environment and leads to a single root-level poetry.lock. It's indeed the same with Cargo workspaces and Yarn workspaces. And this is already working without any of my additions; you can try the repo structure I described. I also believe that poetry should not reinvent the wheel, and should leave to the CICD the things that it does best. That's why I mentioned that I prefer invoking git myself (i.e. the CICD takes care of that), because every CICD pipeline is different. But IMHO bumping versions of the subpackages is something everybody will need, which is why I propose to make this part of poetry. Or, as an alternative, a plug-in to poetry. After that, users can version, publish to you, etc. the way they want. |
If I run |
@remram44 exactly, that's why I've proposed new root-level commands My experience with monorepos is that the users should not run commands from subdirectories. If one were to |
Cargo gets this right, I don't know why you say this is unreasonable. The shortcomings of some tools are no argument to ignore the behavior of working tools. |
@remram44 interesting, I wasn't aware of this feature of Cargo. But well, we have different views about what the monorepo support should look like. Our two views are different, but not mutually exclusive, so let's have both. |
@alecandido Multiple PyPI (distribution) packages in the same project would indeed be one way to do it, and a nice addition. However, I see that as an extension to having packages refer to each other. Projects need to be able to refer to each other, and on top of that, one could add a container project that distributes all those subprojects in one go. The downsides of distributing a single package, mainly dependencies & metadata, prevent me from moving to a monolith package. For now I resort to independently built packages. But that comes with all the typical downsides of a multi-repo approach (MRs on multiple repos, needing to update versions on each one every time). Just having independent projects referring to each other (thus in my example, having both a |
I see two different aspects within the poetry monorepo:
To workaround the missing poetry functionality regarding the second point, I've created an example repo at https://gitlab.com/gerbenoostra/poetry-monorepo . It implements two different approaches:
Any feedback and/or suggestions would be useful. |
I think when considering a monorepo strategy there are quite a few issues to consider, including dependency management and the delineation between components/libraries and projects. This blog post gets really close to my understanding of how a monorepo should be structured, and maybe the discussion can build upon the work done there: |
Looks interesting, especially the part related to dependency resolution as described in https://pnpm.io/workspaces .
On Thu, 1 Sept 2022, 11:36 Michael Oliver wrote:
I think pnpm (https://pnpm.io) from the JavaScript ecosystem really got this right, and many popular open source projects (https://pnpm.io/workspaces#usage-examples) adopted it. It might be worth a look for some inspiration.
|
The code in this PR would enable Workspace support in Poetry. The architecture I would recommend for monorepos (if this feature would be enabled) is called |
Going to fold this into #2270 as we have two parallel issues -- feel free to continue the discussion here, but I'd like to formally track this category of feature in one place. |
Above I mentioned that I tried to work around poetry's limitations and still use poetry in a monorepo. I finally took some time to improve the repo, and have also written a blog post to explain the approach. |
@gerbenoostra fwiw I went ahead with a plugin to do the alternate version mentioned in your blog post (modify the wheels from lock files as a post-build / pre-deploy step). It's got some additional primary goals (freezing versions in wheels): https://github.com/cloud-custodian/poetry-plugin-freeze |
As another alternative, there's a plugin that handles the wheel & sdist packaging by doing work just before the If you want to know more, here's the docs (Multiproject is of course referenced in there): https://davidvujic.github.io/python-polylith-docs/ |
A really nice, to-the-point solution without other assumptions. |
Nice to see multiple initiatives here. I like the single command here: just build it and one gets a full library.
|
Feature Request
The goal is to allow a developer to make changes to multiple python packages in their repo without having to submit and update lock files along the way.
Cargo, which poetry seems to be modeled after, supports this with two features
cargo test --all will run tests on all packages in a workspace.
It looks like some monobuild tools exist for python (buck, pants, bazel) but