Repo Size #13699
Comments
I would like to mention bfg https://rtyley.github.io/bfg-repo-cleaner/ which has worked very well for me. |
Agreed on a lot of this, and I think this is certainly worth considering, but your initial numbers are a little off. We recommend using a shallow clone (e.g. I think you raise some work that's worth doing, specifically investigation into whether we can avoid bundling Chrome. As far as the others:
(not as concerning, and in cloning you only get the master branch, right?)
I get a little squeamish about re-writing history, but if this can be done cleanly (e.g. on a test repo/fork of Gatsby perhaps?) I'd be open to it.
This is for truly huge files, right? I'm not sure we have many that would be worthy of this. Also - I want to keep ease of contribution in the forefront of our minds. In general, cloning is a one-time, sunk cost, so I'd urge us to not make changes that improve the initial set-up cost, but degrade or complicate the experience down the line.
We feel pretty strongly that whenever possible we want to keep content in the monorepo, so same idea here! |
@DSchau Yes, using Cleaning up branches will help a full clone. Git makes copies of all changed files...any branches with commits will add to the size of the repository as a whole. I agree that we can't trade long-term usability for upfront costs. I won't worry about LFS, or additional repo solutions. One thing your repo size from using I think right now the things we should focus on are:
Netlify has recently released a new feature for serving large images, I'm not sure if it is for large files too. I see we're hosted on Netlify so we can make use of that feature. |
If you really do want large files in a Git repo (instead of using something like Git LFS), another option is to store the large assets in a separate repo and pull it in as a submodule. Then at least people that don't want/need the static files can just avoid updating the submodules when pulling. That Netlify feature looks pretty useful though! |
Just came around here since I am on a new machine. Can definitely confirm:
Had to edit: Fetching and checking out the FETCH_HEAD did not help, still could not rebase/merge. |
@eyalroth brought this up as well in #16889
relatedly the initial build time is pretty long from what i recall. no interest in firing it up again to check 😅 |
@sw-yx yeah, the site build is ridiculous. |
Just to add: The There's no way (that I'm aware of) to condense an already cloned repo's history. Sure, I could re-clone, but that's an involved process that adds to SSD wear-and-tear as well as network traffic for anyone on metered connections. I think, as an open-source project, Gatsby should weigh the concerns of all developers including those that might have limited storage or network traffic. I largely agree with the concept of a monorepo for related code like the official plugins, but I don't see a clear benefit to including the docs, starters, and www in the main repo. Additionally, with those split out, it would be potentially easier to look through the commit history and see actual code changes. Finally, GitHub suggests repositories be kept under 1GB, so this is an issue worth considering as this repo approaches that threshold: |
I agree completely @cpboyd. With the i18n projects being put in their own repos that'd make sense to move at least docs and www into their own. |
I'll close this issue as mostly resolved, as most of the proposed solutions are done.
Summary
Cloning the Gatsby repository is becoming a little absurd due to its size. A git clone transfers 550 MB compressed, and the repo takes up 854 MB uncompressed on disk (this takes several minutes to clone even on a decent connection). There has been one attempt to fix this in the past (#6486), though I'm not sure if they rewrote git history to purge the large files.
Getting this reduced would help everyone, and would especially encourage contributions from countries and areas where slower connections are the norm. At this point the repository size is not helping.
Relevant information
There are 92 files in the repo over 1 MB.
Compressed (by git) size: 550 MB
Size on Disk: 850+ MB
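For anyone who wants to reproduce the file count above, here is a minimal sketch of how it could be done. It only walks the current working tree, so it will not see large blobs that live only in git history, and the function name, the 1 MB default, and the skipped directories are my own assumptions rather than how the number above was produced.

```python
import os


def find_large_files(root, threshold_bytes=1_000_000, skip_dirs=(".git", "node_modules")):
    """Return (size_in_bytes, relative_path) pairs for files over the threshold, largest first.

    Assumption: only the checked-out working tree is scanned, not git history.
    """
    large = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune in place so os.walk does not descend into skipped directories.
        dirnames[:] = [d for d in dirnames if d not in skip_dirs]
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue  # do not follow or measure symlinks
            size = os.path.getsize(path)
            if size > threshold_bytes:
                large.append((size, os.path.relpath(path, root)))
    return sorted(large, reverse=True)
```

Calling `find_large_files(".")` from the root of a clone returns the list; `len(...)` of the result gives the count of files over the threshold.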
Optional Solutions
Clean up stale branches that are not needed
My thought was to start by deleting branches that have merged or closed PRs. Anything that hasn't had a PR and is older than a certain age maybe we delete (or maybe give the author notice that it will be deleted). Anything with an open PR can be left.
Clean up detached commits and other git things that don't matter -
I've run across this but I don't entirely understand what it is doing and if there are other things that could be done
Compress images and purge them from history - http://blog.jessitron.com/2013/08/finding-and-removing-large-files-in-git.html
The trouble here is that people might just keep adding large files... I'm not sure if it's possible to write a script, triggered by git hooks, that compresses any images being added to the repo.
Fix gatsby-plugin-screenshots to not need to bundle Chrome - @Ankcorn
Chrome bundle is 44MB.
Git-LFS - this has been brought up before and we'd need to look into the effect on ease of contributions.
Move images out of the repo -
If LFS isn't an option, maybe moving to a CMS like Contentful that could handle assets would be a better alternative. If that's not an option, the website/images/blog could be moved to its own repository.
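To make the stale-branch rule from the first option concrete, here is a rough sketch of how the decision could be expressed. Everything named here is hypothetical: the `Branch` record, the `classify_branch` function, the 180-day cutoff (the proposal only says "a certain age"), and the protected `master` branch are assumptions for illustration, and gathering the branch and PR data is left out entirely.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class Branch:
    name: str
    last_commit: datetime
    pr_state: Optional[str]  # "open", "merged", "closed", or None if there was never a PR


def classify_branch(branch, now, max_age=timedelta(days=180), protected=("master",)):
    """Return "keep", "delete", or "notify" for one branch.

    Assumptions: the 180-day cutoff and the protected list are placeholders.
    """
    if branch.name in protected:
        return "keep"
    if branch.pr_state == "open":
        return "keep"  # anything with an open PR can be left
    if branch.pr_state in ("merged", "closed"):
        return "delete"  # the PR is finished, so the branch is no longer needed
    if now - branch.last_commit > max_age:
        return "notify"  # no PR and old: warn the author before deleting
    return "keep"
```

Returning "notify" instead of deleting outright keeps the softer variant of the proposal, where authors get notice first.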
Prompts
What methods are we okay to move ahead with?
If a method has been given the go-ahead and you want to tackle it, let us know and submit a PR...
What other options are there for reducing repo size that we can consider?