Reduce footprint #34

Closed
Enchufa2 opened this issue Dec 4, 2016 · 6 comments

Comments

@Enchufa2

Enchufa2 commented Dec 4, 2016

I've prepared this script to remove and purge unwanted files from git history. You can use it to remove those big tar.gz files as follows:

./git-remove.sh git@github.com:eddelbuettel/bh.git tar.gz

You'll be asked for confirmation before anything is removed, and then, if everything went OK, the changes will be pushed automatically. I've already tried it on a fork (check it) and it worked nicely (310.61 MiB -> 16.49 MiB).

@eddelbuettel
Owner

eddelbuettel commented Dec 4, 2016

I will give this a try. I looked into it once using the Java-based tool that is often recommended, but didn't like the outcome much (which is in a private repo on GitLab).

Your first key operation appears to be (and I am indenting here):

# Find the files you want to remove:
# list every blob of every revision on master as "<size>\t<path>",
# sum the distinct sizes per path, sort largest first, filter by $EXT
FILE_LIST=$(git rev-list master | \
     while read -r rev; do git ls-tree -lr "$rev" | \
     cut -c54- | sed -r 's/^ +//g;'; done | \
     sort -u | perl -e 'while (<>) {
                    chomp;
                    @stuff=split("\t");$sums{$stuff[1]} += $stuff[0];}
                    print "$sums{$_} $_\n" for (keys %sums);' | \
     sort -rn | grep "$EXT")

and I see no test here. Does it first create a list that one is then supposed to edit? We could easily add a condition here (e.g. drop files over 5 MB each ...)
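For illustration, a size condition like the one suggested here could be a small filter applied to the listing. This is only a sketch: `filter_by_size` is a hypothetical helper, not part of git-remove.sh, and it assumes input lines of the form `<bytes> <path>`, which is what the pipeline above prints. Note that those byte counts are per-path totals over all distinct versions in history, so the cutoff would apply to that total rather than to a single copy of the file.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of git-remove.sh: keep only the lines of a
# "<bytes> <path>" listing whose byte count exceeds the given threshold.
filter_by_size() {
    local min_bytes=$1
    # "+ 0" forces a numeric comparison; matching lines are printed unchanged.
    awk -v min="$min_bytes" '$1 + 0 > min + 0'
}

# Example with a 5 MiB cutoff: only the first entry is larger than that.
printf '%s\n' '9000000 big.tar.gz' '1200 small.tar.gz' |
    filter_by_size $((5 * 1024 * 1024))
```

Under these assumptions it could be appended to the existing pipeline, e.g. `... | sort -rn | grep "$EXT" | filter_by_size 5242880`.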

At the end:

"or to clone a fresh copy. And don't push that again! ;-)"

makes little sense. From the fresh copy one should be able to push, no?
"Just don't push from the old one" ?

@eddelbuettel
Owner

Also reference #25 here

@eddelbuettel
Owner

Now I feel silly -- I didn't see the EXT argument at first and its use. All clear now.

I will give this a try, maybe later today.

@Enchufa2
Author

Enchufa2 commented Dec 4, 2016

> Now I feel silly -- I didn't see the EXT argument at first and its use. All clear now.

My fault for that huge one-liner. :-) Yes, basically, that line lists all the files, sorts them, and filters by the pattern you give. Then FILE_LIST is printed and you're asked for confirmation. You'll see a lot of output (don't worry) and, if everything is OK (the script lists all files again to check that they were removed), you'll be prompted again before the changes are pushed.
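The two prompts described here (one before removing, one before pushing) can be sketched as a small gate around each destructive step. The names `confirm` and `run_if_confirmed` are hypothetical and chosen for this example; the actual script may structure its prompts differently.

```shell
#!/usr/bin/env bash
# Hypothetical helpers, not taken from git-remove.sh.

# Ask a yes/no question; succeed only on an explicit "y" or "Y".
# Anything else, including an empty answer, counts as "no".
confirm() {
    local reply
    read -r -p "${1:-Are you sure?} [y/N] " reply
    [[ $reply =~ ^[Yy]$ ]]
}

# Run the given command only if the user confirms; otherwise report the skip.
run_if_confirmed() {
    local question=$1
    shift
    if confirm "$question"; then
        "$@"
    else
        echo "Skipped: $*"
    fi
}
```

A push step could then be written as `run_if_confirmed "Push the rewritten history?" git push origin --force --all`, so nothing is rewritten on the remote without an explicit "y".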

> "or to clone a fresh copy. And don't push that again! ;-)"
> makes little sense. From the fresh copy one should be able to push, no?
> "Just don't push from the old one" ?

Just a joke. ;-) I mean don't push the purged files again.

@Enchufa2
Author

Enchufa2 commented Dec 4, 2016

I've just indented those one-liners.

@eddelbuettel
Owner

eddelbuettel commented Dec 5, 2016

So I just tried it with the 'backup' copy I had over at GitLab, and it ends badly:

Files successfully removed! Let's push the changes...

Are you sure? [y/N] y

Counting objects: 21443, done.
Delta compression using up to 8 threads.
Compressing objects: 100% (7144/7144), done.
Writing objects: 100% (21443/21443), 13.00 MiB | 701.00 KiB/s, done.
Total 21443 (delta 13986), reused 21443 (delta 13986)
remote: GitLab: You are not allowed to force push code to a protected branch on this project.
To [email protected]:eddelbuettel/bh.git
 ! [remote rejected] master -> master (pre-receive hook declined)
error: failed to push some refs to '[email protected]:eddelbuettel/bh.git'
Everything up-to-date

Thoughts?

Edit: That was a GitLab thing. I unprotected the branch, and the push proceeded. Very nice script, thanks!

I now have 142 MB in a fresh clone of the "pruned" repo and 822 MB in the original.

Edit 2: And also about 140 MB in the original repo once I remove the local (old) bh tarballs. All good.

Nice work, thanks again!
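To put the sizes reported in this thread side by side, a small helper can compute the relative saving. This is a sketch with hypothetical names (`dir_size_kib`, `reduction_pct`); it assumes the sizes were measured on the checkout directories, which the thread does not state.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for comparing checkout sizes; not part of git-remove.sh.

# On-disk size of a directory in KiB, as reported by du.
dir_size_kib() {
    du -sk -- "$1" | cut -f1
}

# Percentage saved when going from size $1 to size $2 (same unit for both).
reduction_pct() {
    awk -v before="$1" -v after="$2" \
        'BEGIN { printf "%.1f\n", (before - after) / before * 100 }'
}

reduction_pct 822 142        # original vs. fresh clone, in MB   -> 82.7
reduction_pct 310.61 16.49   # fork before vs. after, in MiB     -> 94.7
```

With two checkouts on disk, `reduction_pct "$(dir_size_kib old-bh)" "$(dir_size_kib new-bh)"` would give the same figure directly; the directory names here are placeholders.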
