This repository has been archived by the owner on Jun 21, 2023. It is now read-only.

Delete old tags after repo migration. #109

Open
julianrex opened this issue Dec 17, 2019 · 6 comments

Comments

@julianrex
Contributor

julianrex commented Dec 17, 2019

In the process of migrating from mapbox-gl-native we kept the git history (see this comment).

A clean recursive clone of this repo is currently 1.5 GB in size, of which the .git folder contributes 1.1 GB.

We are proposing to filter the git history to exclude history for paths that are no longer part of this repo (but remain in mapbox-gl-native) - using git-filter-repo. The aim is to reduce the size of the .git folder for this and related repos.

In addition, we will investigate & document instructions for a sparse checkout, since mapbox-gl-native test fixtures also take up significant disk space.
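
One possible shape for those sparse-checkout instructions, offered as a sketch rather than the documented procedure: the function below uses Git's long-standing `core.sparseCheckout` setting to keep everything in the working tree except the directories passed to it. The function name `sparse_exclude` and the example path are hypothetical; the actual fixture paths inside the mapbox-gl-native submodule would need to be filled in.

```shell
#!/usr/bin/env bash
# Hypothetical helper: hide the given directories (paths relative to the
# repository root) from the working tree of the repository or submodule in the
# current directory. History is untouched; only the checked-out files change.
sparse_exclude() {
  local patterns dir
  git config core.sparseCheckout true
  # --git-path resolves correctly for submodules, whose git dir lives elsewhere.
  patterns="$(git rev-parse --git-path info/sparse-checkout)"
  mkdir -p "$(dirname "$patterns")"
  {
    echo '/*'                # start by including everything
    for dir in "$@"; do
      echo "!/${dir%/}/"     # then exclude each requested directory
    done
  } > "$patterns"
  # Re-read HEAD so the working tree reflects the new patterns.
  git read-tree -mu HEAD
}

# Example (the path is an assumption, not taken from the repo):
#   cd path/to/submodule && sparse_exclude test/fixtures
```

This only saves working-tree space; the excluded files still exist in the object database, so it complements history filtering rather than replacing it.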

@1ec5
Contributor

1ec5 commented Dec 17, 2019

In case it isn’t clear, this process will involve rewriting history and force-pushing, which means all commit hashes will change and master will have an unrelated root.

@nishant-karajgikar
Contributor

Git Filtering Analysis

I ran a few experiments with git filter-repo and with deleting tags, to get an understanding of how much size we could potentially cut. The numbers are below.

Before cloning submodules

  1. After git clone https://github.com/mapbox/mapbox-gl-native-ios.git, the .git directory is 191 MB (182,917,908 bytes)
  2. Running git count-objects --verbose -H shows a size-pack of 182.7 MB
  3. Running git filter-repo --analyze shows the following overall statistics:
      == Overall Statistics ==
      Number of commits: 16536
      Number of filenames: 30662
      Number of directories: 4856
      Number of file extensions: 156
    
      Total unpacked size (bytes): 1447033869 (1.44 GB)
      Total packed size (bytes): 188236110 (188 MB)

After cloning submodules

  1. After git submodule update --init --recursive, the .git directory is 1.09 GB (1,049,371,971 bytes for 2406 items)
  2. Running git count-objects --verbose -H shows a size-pack of 182.7 MB
  3. Running git filter-repo --analyze shows the following overall statistics :
      == Overall Statistics ==
      Number of commits: 16536
      Number of filenames: 30662
      Number of directories: 4856
      Number of file extensions: 156
    
      Total unpacked size (bytes): 1447033869 (1.44 GB)
      Total packed size (bytes): 188236110 (188 MB)

After ONLY deleting tags

  1. After deleting all local tags using git tag | xargs git tag -d, the .git directory is 1.07 GB
  2. Running git count-objects --verbose -H shows a size-pack of 182.7 MB
  3. Running git filter-repo --analyze shows the following overall statistics :
    == Overall Statistics ==
      Number of commits: 15218
      Number of filenames: 11151
      Number of directories: 2087
      Number of file extensions: 139
    
      Total unpacked size (bytes): 1141210807 (1.14 GB)
      Total packed size (bytes): 119005466 (119 MB) 
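
A likely reason size-pack stays at 182.7 MB here while the analysis reports 119 MB is that deleting a tag only removes the ref; the objects it kept alive remain in the existing pack until the repository is repacked and pruned. The sketch below, assuming a throwaway clone, deletes the tags and then reclaims the space. The function name `delete_tags_and_repack` is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: delete all local tags, then drop the objects that only they kept alive.
# Destructive; intended for a throwaway clone.
delete_tags_and_repack() {
  git tag -l | while IFS= read -r tag; do
    git tag -d "$tag" >/dev/null
  done
  # Reflog entries can still reference the old commits, so expire them first.
  git reflog expire --expire=now --all
  # Repack and prune unreachable objects now instead of after the grace period.
  git gc --prune=now --quiet
  # Report the on-disk numbers, as in the measurements above.
  git count-objects --verbose -H
}
```

After a repack like this, size-pack should track the reachable history more closely, which makes it comparable with the packed size that git filter-repo --analyze reports.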

After ONLY Filtering Git Repo

  1. Ran git-filter-repo --strip-blobs-bigger-than 30M --invert-paths --path platform/android/ --path android/ --path platform/node/ --path platform/linux/ --path include/ --path render-test/ --path scripts/android/ --path src/mbgl/ --path test/ --path vendor/sqlite/ --path benchmark/
  2. Size of .git directory is 976 MB
  3. Running git count-objects --verbose -H shows a size-pack of 110 MB
  4. Running git filter-repo --analyze shows the following overall statistics :
    == Overall Statistics ==
      Number of commits: 9606
      Number of filenames: 24609
      Number of directories: 3885
      Number of file extensions: 134
    
      Total unpacked size (bytes): 931002004 (931 MB)
      Total packed size (bytes): 121966973 (122 MB)

@1ec5
Contributor

1ec5 commented Jan 10, 2020

It makes sense that deleting the tags would have an impact, given that we changed the branching strategy in mid-2018: deleting these tags prunes many parts of the graph that are otherwise inaccessible. (We already deleted the various release branches, after all.) I think we’d have to delete or rewrite the old tags anyway as long as we do any filtering of the repository; otherwise, the stale tags would defeat the purpose of filtering.

Can you measure the impact of doing both – filtering the repository history and deleting old tags?

@nishant-karajgikar
Contributor

@1ec5, I wasn't aware of this. Will update this thread with my findings.

/cc @julianrex

@nishant-karajgikar
Contributor

nishant-karajgikar commented Jan 10, 2020

After deleting tags AND filtering Git repo

  1. Ran git tag | xargs git tag -d after recursive clone.
  2. Ran git-filter-repo --strip-blobs-bigger-than 30M --invert-paths --path platform/android/ --path android/ --path platform/node/ --path platform/linux/ --path include/ --path render-test/ --path scripts/android/ --path src/mbgl/ --path test/ --path vendor/sqlite/ --path benchmark/
  3. Size of .git directory is 940.6 MB
  4. Running git count-objects --verbose -H shows a size-pack of 52 MB
  5. Running git filter-repo --analyze shows the following overall statistics :
== Overall Statistics ==
  Number of commits: 8908
  Number of filenames: 5127
  Number of directories: 1119
  Number of file extensions: 113

  Total unpacked size (bytes): 646948604 (646.9 MB)
  Total packed size (bytes): 53946154 (53.9 MB)
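
To put the four measurements side by side, the snippet below computes how much each approach shrinks the total packed size relative to the untouched clone. It uses only the byte counts reported in this thread; the helper name `reduction_pct` is made up for illustration.

```shell
#!/usr/bin/env bash
# Percentage by which `after` is smaller than `baseline` (both in bytes).
reduction_pct() {
  LC_ALL=C awk -v base="$1" -v after="$2" \
    'BEGIN { printf "%.1f\n", (1 - after / base) * 100 }'
}

baseline=188236110   # total packed size of the untouched clone
echo "tags only:       $(reduction_pct "$baseline" 119005466)%"
echo "filter only:     $(reduction_pct "$baseline" 121966973)%"
echo "tags and filter: $(reduction_pct "$baseline" 53946154)%"
```

Deleting tags alone or filtering alone each removes roughly 35 to 37 percent of the packed size, while doing both removes about 71 percent.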

@julianrex
Contributor Author

julianrex commented Jan 21, 2020

The decision has been made to:

  • Remove old tags (pre fork)
  • Remove old releases from this repo (pre fork)
  • Tag the commit that was forked

This is planned for the week of February 10th - 14th.
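
The tag cleanup could be scripted along the following lines. This is a sketch under assumptions: "pre fork" is interpreted by date (tags created before the forked commit), and the function name `list_prefork_tags`, the tag name `fork-point`, and the `FORK_COMMIT` variable are all placeholders rather than anything decided in this thread.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the names of tags created before the given commit.
# Interpreting "pre fork" by creation date is an assumption.
list_prefork_tags() {
  local cutoff
  cutoff="$(git log -1 --format=%ct "$1")" || return 1
  git for-each-ref --format='%(creatordate:unix) %(refname:strip=2)' refs/tags |
    awk -v cutoff="$cutoff" '$1 + 0 < cutoff + 0 { print $2 }'
}

# Example usage (tag name and FORK_COMMIT are placeholders):
#   git tag -a fork-point -m "Commit forked from mapbox-gl-native" "$FORK_COMMIT"
#   list_prefork_tags "$FORK_COMMIT" | while IFS= read -r tag; do
#     git tag -d "$tag"                           # delete locally
#     git push origin --delete "refs/tags/$tag"   # delete on the remote
#   done
```

Reviewing the output of `list_prefork_tags` before deleting anything is worthwhile, since remote tag deletion is hard to undo once other clones have pruned.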

@julianrex julianrex changed the title from "Filter git history after repo migration." to "Delete old tags after repo migration." Jan 21, 2020
@julianrex julianrex assigned 1ec5 and unassigned nishant-karajgikar Feb 24, 2020
@chloekraw chloekraw removed this from the Release Unicorn milestone Mar 12, 2020
@knov knov unassigned 1ec5 Jul 6, 2020