Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

vendor/bundle/ruby: cleanup unneeded files #15954

Merged
merged 1 commit into from
Sep 27, 2023

Conversation

Bo98
Copy link
Member

@Bo98 Bo98 commented Sep 4, 2023

  • Remove gems that don't need to be vendored (e.g. rubocop-*).
  • Slim down ActiveSupport even further now that we use less of it.
  • Commit license files of gems we do vendor, where available.

Draft because this requires #15953.

@MikeMcQuaid
Copy link
Member

  • Remove gems that don't need to be vendored (e.g. rubocop-*).

These do need to be vendored for VSCode editor support.

  • Slim down ActiveSupport even further now that we use less of it.

🎉

  • Commit license files of gems we do vendor, where available.

https://github.com/github/licensed may be an alternate option here and we could commit a single NOTICES file.

@apainintheneck
Copy link
Contributor

  • Remove gems that don't need to be vendored (e.g. rubocop-*).

These do need to be vendored for VSCode editor support.

How does this work in practice? I use VSCode but whenever I want to lint things I just run brew style.

@MikeMcQuaid
Copy link
Member

How does this work in practice? I use VSCode but whenever I want to lint things I just run brew style.

Most VSCode RuboCop plugins will just run rubocop and show up results in the editor.

@Bo98
Copy link
Member Author

Bo98 commented Sep 14, 2023

How it works in practice is we configure the path:

"ruby.rubocop.executePath": "Library/Homebrew/shims/gems/",

and we have this shim: https://github.com/Homebrew/brew/blob/master/Library/Homebrew/shims/gems/rubocop

which calls: https://github.com/Homebrew/brew/blob/master/Library/Homebrew/dev-cmd/rubocop.sh

which invokes brew install-bundler-gems internally (so no need for vendoring)

and then it invokes a special RuboCop startup script (we don't use RuboCop's own binary): https://github.com/Homebrew/brew/blob/master/Library/Homebrew/utils/rubocop.rb.

@Bo98 Bo98 marked this pull request as ready for review September 14, 2023 21:41
@Bo98
Copy link
Member Author

Bo98 commented Sep 15, 2023

Hmm it seems brew install-bundler-gems doesn't install these again once removed. I'll need to look into that but I'll maybe fix that in a separate PR (that will of course be merged first) so consider this PR ready for review otherwise.

I think the problem is the directories aren't fully removed with this commit because some readme etc files were ignored, so the gem directory is still there. bundle check does not check the integrity of abitrary files within the gem directories.

Worst case scenario is we just do the gitignore change and gradually these will be removed over time with gem updates.

@Bo98 Bo98 marked this pull request as draft September 15, 2023 02:24
@MikeMcQuaid
Copy link
Member

Hmm it seems brew install-bundler-gems doesn't install these again once removed. I'll need to look into that but I'll maybe fix that in a separate PR (that will of course be merged first) so consider this PR ready for review otherwise.

I guess we could look at e.g. once file for each gem we need and run bundle install in that case.

Worst case scenario is we just do the gitignore change and gradually these will be removed over time with gem updates.

Sounds good!

@Bo98
Copy link
Member Author

Bo98 commented Sep 22, 2023

Worst case scenario is we just do the gitignore change and gradually these will be removed over time with gem updates.

This wouldn't work because it'll cause uncommitted changes if bundle removes those gems.

I guess we could look at e.g. once file for each gem we need and run bundle install in that case.

I did come up with something like this:

diff --git a/Library/Homebrew/utils/gems.rb b/Library/Homebrew/utils/gems.rb
index 25c10b29f..aed4621d3 100644
--- a/Library/Homebrew/utils/gems.rb
+++ b/Library/Homebrew/utils/gems.rb
@@ -22,6 +22,11 @@ module Homebrew
     File.join(ENV.fetch("HOMEBREW_LIBRARY"), "Homebrew", "Gemfile")
   end
 
+  # @api private
+  def bundler_definition
+    @bundler_definition ||= Bundler::Definition.build(Bundler.default_gemfile, Bundler.default_lockfile, false)
+  end
+
   # @api private
   def valid_gem_groups
     install_bundler!
@@ -29,7 +34,7 @@ module Homebrew
 
     Bundler.with_unbundled_env do
       ENV["BUNDLE_GEMFILE"] = gemfile
-      groups = Bundler::Definition.build(Bundler.default_gemfile, Bundler.default_lockfile, false).groups
+      groups = bundler_definition.groups
       groups.delete(:default)
       groups.map(&:to_s)
     end
@@ -229,7 +234,35 @@ module Homebrew
       bundle_check_failed = !$CHILD_STATUS.success?
 
       # for some reason sometimes the exit code lies so check the output too.
-      bundle_installed = if bundle_check_failed || bundle_check_output.include?("Install missing gems")
+      bundle_install_required = bundle_check_failed || bundle_check_output.include?("Install missing gems")
+
+      # Check if the install is intact. This is useful if any gems are added to gitignore.
+      # We intentionally map over everything and then call `any?` so that we remove the spec of each bad gem.
+      bundle_install_required ||= bundler_definition.specs.map do |spec|
+        spec_file = "#{Gem.dir}/specifications/#{spec.full_name}.gemspec"
+        next false unless File.exist?(spec_file)
+
+        cache_file = "#{Gem.dir}/cache/#{spec.full_name}.gem"
+        if File.exist?(cache_file)
+          require "rubygems/package"
+          package = Gem::Package.new(cache_file)
+          def package.verify # disable verification
+            true
+          end
+          contents = package.contents
+
+          package_install_intact = contents && contents.all? do |gem_file|
+            File.exist?("#{Gem.dir}/gems/#{spec.full_name}/#{gem_file}")
+          end
+          next false if package_install_intact
+        end
+
+        # Mark gem for reinstallation
+        File.unlink(spec_file)
+        true
+      end.any?
+
+      bundle_installed = if bundle_install_required
         if system bundle, "install"
           true
         else

Which is effective at doing a full verification. It does add half a second to no-op brew install-bundler-gems though, which may be noticeable, as it is reading a few .tar.gz archives nested in .gem files.

So a couple paths forward we could do:

  • Have some sort of global "vendored gem version" that we bump everytime we remove a committed vendored gem, since we have done something like this before: e16f4ef. It'll do the full check of everything once and never again until the next bump.
  • Have a hack specific to this PR that picks one file removed from each gem here and checks those. This is a hardcoded list that we would manually remove from when no longer necessary.

@MikeMcQuaid
Copy link
Member

Thanks @Bo98, great work as usual!

It does add half a second to no-op brew install-bundler-gems though, which may be noticeable, as it is reading a few .tar.gz archives nested in .gem files.

This seems pretty reasonable to me given it makes things considerably more reliable. If that can get slimmed down at all: great, otherwise I think it's not a big deal.

  • Have some sort of global "vendored gem version" that we bump everytime we remove a committed vendored gem, since we have done something like this before: e16f4ef. It'll do the full check of everything once and never again until the next bump.

This seems also reasonable but would want to ensure it's enforced by CI i.e. if you forget to bump (or bump unnecessarily): something in CI fails. This would be a blocker on doing this.

Suggestion: do the above for now and someone can improve performance once it's working.

@Bo98 Bo98 force-pushed the vendor-cleanup branch 2 times, most recently from 431e7fc to 9fcfbfa Compare September 26, 2023 17:37
@Bo98
Copy link
Member Author

Bo98 commented Sep 26, 2023

This seems also reasonable but would want to ensure it's enforced by CI i.e. if you forget to bump (or bump unnecessarily): something in CI fails. This would be a blocker on doing this.

I took a stab at it.

Despite the initial complexity, 🤞 it should make these type of things trivial to deal with in the future, instead of scratching heads for a couple weeks like I have here over something I only noticed when I was switching between branches (CI was passing).

@Bo98 Bo98 marked this pull request as ready for review September 26, 2023 17:53
Copy link
Member

@MikeMcQuaid MikeMcQuaid left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, great work @Bo98!

Comment on lines +47 to +77
run: |
{
echo "vendor-version=$(cat ../.homebrew_vendor_version)"
echo "ignored<<EOS"
git check-ignore -- *
echo "EOS"
} >> "${GITHUB_OUTPUT}"

- name: Compare to base ref
working-directory: ${{ steps.set-up-homebrew.outputs.gems-path }}/${{ steps.ruby-abi.outputs.version }}
run: |
git checkout "origin/${GITHUB_BASE_REF}"
rm .homebrew_vendor_version
brew install-bundler-gems --groups=all
if [[ "$(cat .homebrew_vendor_version)" == "${{ steps.gem-info.outputs.vendor-version }}" ]]; then
ignored_gems="${{ steps.gem-info.outputs.ignored }}"
while IFS= read -r gem; do
gem_dir="./gems/${gem}"
[[ -d "${gem_dir}" ]] || continue
exit_code=0
git check-ignore --quiet "${gem_dir}" || exit_code=$?
if (( exit_code != 0 )); then
if (( exit_code == 1 )); then
echo "::error::VENDOR_VERSION needs bumping in utils/gems.rb" >&2
else
echo "::error::git check-ignore failed" >&2
fi
exit "${exit_code}"
fi
done <<< "${ignored_gems}"
fi
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not blocking but might be nice to pull these into dedicated Bash scripts instead at some point.

@MikeMcQuaid MikeMcQuaid merged commit e501853 into Homebrew:master Sep 27, 2023
25 checks passed
@github-actions github-actions bot added the outdated PR was locked due to age label Oct 28, 2023
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Oct 28, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
outdated PR was locked due to age
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants