[suggestion] split the composer package into separate components? #808
Hi @Alorel, This is something we've considered but that would be difficult to implement cleanly in a backwards compatible way. One thing we have yet to figure out is if this approach would be any faster: instead of just resolving a single dependency, Are you seeing any particular issues that you believe a smaller dependency would solve? |
@jeskew It's mostly about personal preference on not having to download packages you don't need. In my specific use case (which is not, by far, applicable to a lot of users) having fewer components to download would be a significant speedup as my company requires its employees to live-sync all changes to a virtual host on the development server instead of using localhost. Because Composer's update process involves fully removing a package and then simply replacing it with a new one, the IDE ends up taking a lot of time deleting the entire SDK from the remote server and then re-uploading the new version even though, in many cases, only a few files have changes in them. |
@Alorel It sounds like you might benefit from using rsync -c or something else that checksums files before replacing them instead of SCPing or SFTPing the whole kit and caboodle. I'm going to leave this issue open in case anyone else wants to chime in on why they would or would not want a modularized SDK. |
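The checksum idea behind rsync -c can be sketched in Python: hash every file in the old and the new copy of a directory, then transfer only the files whose digests differ. This is a minimal illustration, not how rsync itself is implemented; the function names and the two-directory layout are assumptions made for the example.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def changed_files(old_root: Path, new_root: Path) -> dict:
    """Compare two trees by content; report paths to upload and to delete."""
    old = {p.relative_to(old_root).as_posix(): p
           for p in old_root.rglob("*") if p.is_file()}
    new = {p.relative_to(new_root).as_posix(): p
           for p in new_root.rglob("*") if p.is_file()}
    # Upload files that are new or whose content hash differs.
    upload = sorted(
        rel for rel, p in new.items()
        if rel not in old or file_digest(old[rel]) != file_digest(p)
    )
    # Delete files that no longer exist in the new tree.
    delete = sorted(rel for rel in old if rel not in new)
    return {"upload": upload, "delete": delete}
```

With a comparison like this, a package update that touches only a few files results in only those few files being re-uploaded, rather than the whole SDK.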
Pros:
Cons:
Generally speaking, I'd favor keeping packages specific to their function/purpose instead of lumping everything together. Obviously, there are drawbacks. I'm probably overlooking other issues/points. |
More Pros:
More Cons:
|
For people who don't use Composer, this change would be invisible. We would still distribute the SDK as a phar. |
Having a single package bloats autoloading and actually has runtime performance impacts, which in my opinion are equally important to dev time comfort. |
+1 |
I'd love to see this split on Composer too. I'm not sure it's worth it, like everybody said, but I believe most applications use just a few services. In my case I use S3 and DynamoDB only. |
+1 |
+1 |
+1 |
+1 |
+1 We've just had a situation where Composer gave up trying to install aws-sdk because it timed out and left us with a broken package that broke the site completely. Fortunately we had a copy of it elsewhere and were able to upload it manually. Like others, we really only use S3 so making a separate package for it makes a huge amount of sense to me. If you can publish the dependency graph for S3 usage, I'd even be prepared to write a script that repackages just the stuff required. |
I've got a package that just takes in the S3 stuff now with a utility to run to check for updates to the main package. Be useful if one of the aws guys could review it? |
+1 |
I also agree about splitting the SDK into components. This would still be possible with a simple git subtree, exactly like what @symfony does. Then you can keep the same workflow for upgrades, versions, etc., and hook in the git/github process to apply the modifications to the subsplits. Maybe you could watch this conference: https://www.youtube.com/watch?v=ZVsDA6GhKOU to get some inspirations 🙂 |
+1 Hundreds of largely unrelated components and services rolled into one package. If there's a breaking change to any one of these, every user of any of these hundreds of dependencies will be affected. I mean, for crying out. This is not how Composer works. This is just not how it's done. I'd have thought a huge company like Amazon would know better 🙄
I'd like to note that this is a really bad idea as well - it only solves the somewhat esoteric problem of deploying too many files (which is mostly not really a problem, assuming you have byte-code caching enabled, and you let Composer generate an optimized auto-loader, which you should be doing in production anyhow) and doesn't address the versioning issue. (all of Symfony's sub-packages get bumped whenever the framework master package gets bumped, regardless of whether there's any breaking changes in the sub-trees, the actual packages - these should have been individually versioned to ensure a meaningful upgrade process for developers; it's just laziness, really.) I can't really believe I'm out here explaining this to Amazon 😐 |
It's not laziness. It's just that the framework ( There are 3 active branches for Symfony, so this means that each month, at least 150 git tags are created, bugs on lower branches should be ported to more recent branches if the bug exists, so in the end, the majority of packages will benefit from at least one new commit on each release. I think having such a system for AWS could be almost painless, since AWS doesn't really need full BC and doesn't seem to need to support "older" branches, therefore there will only be one tag per package, and global maintenance can still be in the main repo (and it must, actually). Fabien Potencier created splitsh to ease the setup of such a system, but It saddens me that we have to download 11MB of package when the SDK releases a new version, while we could download like 200KB instead for just the S3 client (it's more than 50 times lighter!). |
I would usually say that size doesn't matter that much, but for things like AWS Lambda size DOES matter. The SDK keeps getting bigger with new services, while often only a few are required. A split could still let this package exist as it is, but also create standalone versions. This main repo would just 'replace' all split versions, so it doesn't break or conflict with existing apps. It does require a core/common package for things like http/auth, but those dependencies can be configured once. |
We are in an era where bandwidth is really becoming important. Even @symfony made a small change to make their downloaded packages smaller (symfony/symfony#33579), I think AWS should also embrace some new practices. |
Is this going to be something we can expect in the future? |
Running PHP on lambda makes this problem much more critical: a larger vendor directory implies longer cold starts, longer deployments and more chance to hit the Lambda max disk size. Is there any way we can help? |
I would help too if needed |
One of the most annoying things that AWS does is include the .changes directory in the package. Totally unnecessary. It's documentation and can live elsewhere. I adopted the practice of building rpm packages for my php applications a long time ago to ease rollbacks and ensure non-php dependencies are met. I simply exclude the .changes directory from the package build (using fpm, if you haven't come across it.) This is simple to do, but without some way of creating a dependency graph of the required files it would become an irksome task to apply across the whole aws package. We've even considered installing that 'centrally' on our servers and symlinking to it for the production builds, which I used to do back in the days of hosting multiple clients running the same core code. Won't help @mnapoli with Lambda, but where you are running multiple services on a single server, it might be a way out until AWS get their act together. |
@chippyash that's interesting! What is the |
@mnapoli .changes is just a list of package changes, one per release by the looks of it. And it's big. Totally unnecessary. Ditto for the vendor/aws/aws-sdk-php/.github directory. Not even sure why that is in the repo at all. |
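Excluding those directories at package-build time could look roughly like the sketch below. It copies a directory tree while skipping .changes and .github. Whether the SDK really runs without them is an assumption taken from the comments above, not something confirmed by a maintainer, and the function name is made up for the example.

```python
import shutil
from pathlib import Path

# Directories named in the thread as documentation/repo metadata.
# Treating them as unnecessary at runtime is an assumption.
EXCLUDED_DIRS = (".changes", ".github")


def copy_without_docs(src: Path, dst: Path) -> None:
    """Copy src to dst, skipping entries whose names match EXCLUDED_DIRS.

    dst must not exist yet. The patterns are matched at every depth of the
    tree, not only at the top level.
    """
    shutil.copytree(src, dst, ignore=shutil.ignore_patterns(*EXCLUDED_DIRS))
```

A build step could call this on the installed package directory and ship the copy instead of the original.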
If we can get a maintainer to confirm that these directories are not useful at runtime, I can send a pull request excluding them. @Alorel maybe? |
There is a Pull Request open already: #1949 |
Awesome thanks! |
Installing the SDK currently takes 33MB on disk (as reported by Such huge package size is a drawback in any place where the size of the source code matters (AWS lambda is such a place, but any place relying on containers also prefers getting them smaller). So such split should be considered (the impact on BC is for classes which don't belong to a service, but this could potentially even be kept in their current namespace even if they get moved to a separate folder in the mono-repo with some PSR-4 config) Regarding releases, Symfony chose to release all packages in sync. But it could also be possible to release them separately instead, by releasing only modified packages. |
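To see where the space goes in an installed copy, a small script can total the file sizes under each top-level subdirectory. This is a rough sketch: it sums apparent file sizes, so the numbers will differ somewhat from on-disk usage, which is rounded up to filesystem blocks. The function names are chosen for the example.

```python
import os
from pathlib import Path


def dir_size(root: Path) -> int:
    """Sum the apparent sizes (in bytes) of all regular files under root."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            fp = Path(dirpath) / name
            if not fp.is_symlink():  # do not count symlink targets
                total += fp.stat().st_size
    return total


def size_by_child(root: Path) -> list:
    """Return (name, bytes) for each direct subdirectory of root, largest first."""
    sizes = [(c.name, dir_size(c)) for c in root.iterdir() if c.is_dir()]
    return sorted(sizes, key=lambda item: item[1], reverse=True)
```

Running size_by_child on the package directory shows which subdirectories dominate the total.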
@stof Do you know if that data directory is required to actually run the library in production, or just there as another documentation and/or test aide for AWS? If not required, I can add it to my ignore list when building packages. [edit] |
@chippyash it is also used by the Client class to know which endpoints are available in each service and what are the expected arguments for each of them. |
That's a shame. |
Hmmm.. v3.171.6 today....
114Mb - gulp. |
I know that this is a super old issue. I also know that there are a lot of reasons why this has not been fixed yet. If you are one of the people who really need a small AWS client, there is an AsyncAws organisation that (among other things) addresses this issue. See https://github.com/async-aws/aws Normally I would never promote a competing library like this, especially one that I've created. However, any AWS API client is helping the user to buy more AWS products. So in that sense, AsyncAws is not a competing library, just a complement.
Thank you for appreciating my PR. =) @PhilETaylor, Make sure you download the non-source version. Ie:
|
Is it not possible to generate the required data (endpoints) in a compact PHP format, with a build script for each release, and exclude the data folders? |
Hi everyone, |
I'm using Docker, and want to minimize the container size as much as possible, so I figured a temporary solution would be to delete unnecessary files after composer has installed them. Is there a safe way to determine which files can be deleted if you know which services you need? |
@barasimumatik that's a tough question, but I'll do my best to answer. It would be nearly impossible for me to test this without a more specific use case, so this may not apply to your personal container.
I have only done some minimal testing on this, so delete them at your own risk. I hope this helps :) |
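One cautious way to approach this kind of cleanup is a dry run that only lists what would be removed. The sketch below is not the list from the comment above and assumes nothing about the SDK's real layout: it takes any directory that contains one subdirectory per service (a hypothetical layout) and a set of service names to keep, and reports the rest. It does not detect shared code that every client depends on, so the result still needs testing before anything is deleted.

```python
from pathlib import Path


def prune_candidates(services_root: Path, keep: set) -> list:
    """List per-service subdirectories whose names are not in keep.

    Names are compared case-insensitively. Nothing is deleted; the caller
    decides what to do with the returned names.
    """
    wanted = {name.lower() for name in keep}
    return sorted(
        child.name
        for child in services_root.iterdir()
        if child.is_dir() and child.name.lower() not in wanted
    )
```

Reviewing the returned list, then running the application's test suite against a copy with those directories removed, keeps the "delete at your own risk" step reversible.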
@SamRemis wow, thank you very much! The SDK dwarfs any other composer dependency I have, and accounts for about 15% of the whole container image right now, so it will definitely help 😄 |
note that if you install from dist, the |
@SamRemis splitting out the AWS PHP package has been on the backlog for a bit now, is there any update regarding this split? We wish to only use the S3Client for instance, but that does require the full install. Thanks! |
That's not something that's currently being worked on for this version of the SDK unfortunately; the way it's designed would make that very difficult. |
Like @SamRemis mentioned above, this isn't something we're in a position to do currently, but it's on the horizon via the next major version. We recently had a similar feature request which we've converted into a discussion regarding reducing the V3 package size. We're trying to gauge interest in a feature that cleans up unused services and/or unnecessary directories, but we're open to other suggestions. |
Closing this since modularization is planned for the next major version. We'd still like to get feedback or suggestions related to reducing the v3 package size (see my comment above)— the hyperlink will take you to a discussion we have open on the topic. |
|
Hi!
Our company only uses the S3 and SQS libraries, however, in order to have these installed via Composer, the entire AWS package needs to be downloaded, inflating Composer's autoload files. Would it perhaps be possible to split the AWS SDK into separately installable components (e.g. composer require aws/s3-php-dk) which would then have some requirements of their own (e.g. the s3 package would require aws/common-php)?