Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update to 0.12.5 #49

Merged
merged 18 commits into from
Oct 31, 2019
Merged

Update to 0.12.5 #49

merged 18 commits into from
Oct 31, 2019

Conversation

joergschiller
Copy link
Contributor

@joergschiller joergschiller commented Apr 8, 2019

Hello!

In reference to #46 I would like to propose this PR to update the wkhtmltopdf binaries to 0.12.5.

I extracted the binaries from https://wkhtmltopdf.org/downloads.html for:

  • Ubuntu 18.04 (bionic) amd64 and i386
  • Ubuntu 16.04 (xenial) amd64 and i386
  • macOS Cocoa 64-bit and Carbon 32-bit

The wrapper script was updated as well to support the different Ubuntu releases.

It still should only work on Ubuntu as it calls lsb_release to retrieve the codename. It was mentioned that the last version also only works on Ubuntu-based systems.

0.12.4 had the same issue, and we released a "good enough" version that worked on at least Ubuntu-based linux systems based on the binaries provided, and I think that's what we are looking for here.

I tested it (very) shortly on my Macbook (macOS Cocoa 64-bit) and a Ubuntu 16.04 (xenial) amd64 machine. But I thought this PR would be a good starting point to discuss the changes.

Thanks alot for your work on this Gem so far!

Also separate 32/64bit version on macos and use more specific filenames.
Uses `lsb_release` to decide which Ubuntu version is in use.
Copy link

@webaholik webaholik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This solution isn't viable, there are a bunch of users on Centos and other flavors of Ubuntu (some of which may have an issue with the xenial or bionic binaries).

Would be nice to exclude the binary packages and pull in only the required binary, on demand, via a config file. Eventually, this could possibly be automated.

bin/wkhtmltopdf Outdated Show resolved Hide resolved
bin/wkhtmltopdf Show resolved Hide resolved
bin/wkhtmltopdf Outdated Show resolved Hide resolved
@joergschiller
Copy link
Contributor Author

joergschiller commented Apr 15, 2019

@webaholik thanks for your hints so far. If you have access to a CentOS 7 and 6 machine you could do me a favor and check the content of /etc/os-release.

E.g. on Ubuntu it looks like:

NAME="Ubuntu"
VERSION="16.04.5 LTS (Xenial Xerus)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 16.04.5 LTS"
VERSION_ID="16.04"
HOME_URL="http://www.ubuntu.com/"
SUPPORT_URL="http://help.ubuntu.com/"
BUG_REPORT_URL="http://bugs.launchpad.net/ubuntu/"
VERSION_CODENAME=xenial
UBUNTU_CODENAME=xenial

Add some example command on how to extract binaries for the different systems.
@joergschiller
Copy link
Contributor Author

@webaholik I've added the binaries for CentOS. Maybe you can try this out now?

@webaholik
Copy link

webaholik commented Apr 15, 2019

@webaholik thanks for your hints so far. If you have access to a CentOS 7 and 6 machine you could do me a favor and check the content of /etc/os-release.

Centos 6 does not have /etc/os-release. Centos 6 and higer does has: /etc/centos-release- so you may be better off do something along the lines of /etc/*os-release instead.

CentOS 6:

# cat /etc/os-release
cat: /etc/os-release: No such file or directory
# cat /etc/*os-release
CentOS release 6.10 (Final)

CentOS 7:

# cat /etc/centos-release
CentOS Linux release 7.4.1708 (Core)

and apparently /etc/os-release is available for CentOS 7:

# cat /etc/os-release
NAME="CentOS Linux"
VERSION="7 (Core)"
ID="centos"
ID_LIKE="rhel fedora"
VERSION_ID="7"
PRETTY_NAME="CentOS Linux 7 (Core)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:centos:centos:7"
HOME_URL="https://www.centos.org/"
BUG_REPORT_URL="https://bugs.centos.org/"

CENTOS_MANTISBT_PROJECT="CentOS-7"
CENTOS_MANTISBT_PROJECT_VERSION="7"
REDHAT_SUPPORT_PRODUCT="centos"
REDHAT_SUPPORT_PRODUCT_VERSION="7"

Debian 9:

# cat /etc/*os-release
PRETTY_NAME="Debian GNU/Linux 9 (stretch)"
NAME="Debian GNU/Linux"
VERSION_ID="9"
VERSION="9 (stretch)"
ID=debian
HOME_URL="https://www.debian.org/"
SUPPORT_URL="https://www.debian.org/support"
BUG_REPORT_URL="https://bugs.debian.org/"

@webaholik
Copy link

webaholik commented Apr 15, 2019

@joergschiller, feel free to improve, but I've tested this in Debian 9, CentOS 7 & CentOS 6:

    if File.exist?('/etc/os-release')
      os = `. /etc/os-release && echo ${ID}_${VERSION_ID}`.strip
    elsif File.exist?('/etc/centos-release') && centos = File.read('/etc/centos-release')
      # For CentOS < 7
      version_id = centos.scan(/\d+/).first
      os = "centos_#{version_id}"
    end

bin/wkhtmltopdf Outdated Show resolved Hide resolved
@joergschiller
Copy link
Contributor Author

@unixmonkey I've added Dockerfiles for the linux based distros to ease testing. There is also a rudimentary minitest in test/test_with_docker.rb. Hope that isn't too much change but it made testing much more easier for me (beside the help of @webaholik).

@joergschiller
Copy link
Contributor Author

joergschiller commented Apr 16, 2019

@unixmonkey I'll try to have a look at the binary compression later. But I know that GitHub disallows files larger than 100 MB (https://help.github.com/en/articles/working-with-large-files).

Usually one manages bigger files with https://git-lfs.github.com/ - I used it before and it works very well. Can I add it here?

Expectation won't match otherwise (as output includes all build info).
@joergschiller
Copy link
Contributor Author

joergschiller commented Apr 17, 2019

Usually one manages bigger files with https://git-lfs.github.com/ - I used it before and it works very well. Can I add it here?

I tried to move the binaries to Git LFS but GitHub doesn't allow adding LFS objects on forks until the original repo has at least one LFS object (see git-lfs/git-lfs#1449 (comment)).

I've made a copy of the repo for testing and added Git LFS to it:
https://github.com/zinsbaustein/wkhtmltopdf_binary_gem_with_git_lfs/tree/update_to_0.12.5

Maybe @unixmonkey you can have a look around if this fine for you and I'll try to provide you with the necessary steps to 'introduce' Git LFS - if you're not already aware of that ;-)

It makes things easier, e.g. we can't see the changes of this PR here because the diff is too large. And files larger than 100 MB are forbidden, which prevents compressing all the binaries.

@joergschiller
Copy link
Contributor Author

I really appreciate your work and understand that there are a lot more things for you to do. Just want to see if there is something I can do to bring this forward?

@unixmonkey
Copy link
Collaborator

unixmonkey commented May 7, 2019

@joergschiller Are some of the binaries bigger than 100mb compressed? The OSX uncompressed binary is 46M.

Doesn't GitHub's file size limit mean that we'd have to host the LFS files outside of GitHub?

This post states that Rubygems itself has a 500mb limit on gems.

I'm not sure what to do here, perhaps we could make this an umbrella repo, with dependencies on other (new) repos for certain platforms or architectures to split things up, but managing all that "extra" is pretty unappealing too. Do you have any thoughts, @zakird ?

@joergschiller
Copy link
Contributor Author

Good point! Each file is well below the per file limit of Github even without compression:

34M Apr 15 20:06 wkhtmltopdf_centos_6_amd64
35M Apr 15 20:05 wkhtmltopdf_centos_6_i386
36M Apr 15 20:07 wkhtmltopdf_centos_7_amd64
37M Apr 15 20:06 wkhtmltopdf_centos_7_i386
37M Apr 15 20:23 wkhtmltopdf_debian_8_amd64
39M Apr 15 20:23 wkhtmltopdf_debian_8_i386
42M Apr 15 20:24 wkhtmltopdf_debian_9_amd64
42M Apr 15 20:24 wkhtmltopdf_debian_9_i386
46M Apr  8 12:43 wkhtmltopdf_macos_carbon
49M Apr  8 12:43 wkhtmltopdf_macos_cocoa
36M Apr 15 19:19 wkhtmltopdf_ubuntu_14.04_amd64
37M Apr 15 19:18 wkhtmltopdf_ubuntu_14.04_i386
38M Apr  8 12:43 wkhtmltopdf_ubuntu_16.04_amd64
40M Apr  8 12:43 wkhtmltopdf_ubuntu_16.04_i386
45M Apr  8 12:43 wkhtmltopdf_ubuntu_18.04_amd64
44M Apr  8 12:43 wkhtmltopdf_ubuntu_18.04_i386

Compressing all files individually (with bzip -9) results in a total size of all binaries of roughly 224M. So this should work for Github and Rubygems as well.

Don't know why I was so focused on compressing them into one big tar file...

@joergschiller
Copy link
Contributor Author

joergschiller commented May 7, 2019

In terms of avoiding a fat binary gem I like the way webdrivers does it. It downloads the required binary during runtime/first use: https://github.com/titusfortner/webdrivers/blob/master/lib/webdrivers/common.rb#L89

@longkt90
Copy link

longkt90 commented May 13, 2019

hi @joergschiller, thanks for the hard work, especially the testing part with docker and README file are
super helpful.

With my macOS I experienced this error while trying to make a PDF, (wkhtmltopdf --version is working and return wkhtmltopdf 0.12.5 (with patched qt) though - because the file with the same name is already there I think)

Command Error: bzip2: Input file wkhtmltopdf_binary_gem/bin/wkhtmltopdf_macos_cocoa.bz2 has 1 other link. adding a force option -f works:

system("bzip2 --decompress --keep #{binary}.bz2 -f")

maybe we should add this option to the gem too

@joergschiller
Copy link
Contributor Author

joergschiller commented May 13, 2019

Thanks for trying it out @longkt90 !

I can reproduce the issue with a hardlink:

mv wkhtmltopdf_macos_cocoa.bz2 wkhtmltopdf_macos_cocoa_testinglink.bz2
ln wkhtmltopdf_macos_cocoa_testinglink.bz2 wkhtmltopdf_macos_cocoa.bz2

When running ./wkhtmltopdf --version I also get 'Input file ./wkhtmltopdf_macos_cocoa.bz2 has 1 other link'. I just don't understand whats creating a hard link in the first place.

It's just a guess: but running bunzip2 --keep on the hard link also produces this error even though it shouldn't change the source file. I found a very old bug report about it.

I'm not sure if we should override this with --force. Maybe it's a good idea to extract the archive with e.g. bzcat wkhtmltopdf_macos_cocoa.bz2 > wkhtmltopdf_macos_cocoa to avoid touching the source file in any way.

Or it is somehow a race condition. The script extracts the binary on first run. If it is executed multiple times in parallel with no extracted binary ready it could be in a strange state.

I'll try to test and think about this ;-)

There was an error on macOS when a hard link to the archive exists (could be Time Machine). Even with `--keep` it states: 'Input file has 1 other link'. Using `--force` fixes this.
@joergschiller
Copy link
Contributor Author

joergschiller commented May 14, 2019

@longkt90 I've added --force as you suggested - thanks!

4e225b9

I'm not sure but I guess Time Machine created a hard link to the archive. bzip2 --decompress --keep looks if there is a hard link 'pointing to' the archive and complains about this with the error message.

@sshaw
Copy link

sshaw commented May 25, 2019

Thanks for the great work here 👍. I have a suggestion –or four:

  1. You can reduce the number of binaries bundled by creating platform-specific gems via RubyGems platform option
    1. This can be further divided into per-distro gems
  2. With the widespread use of containers, it's likely that one will not have bzip2 installed. Maybe Ruby's zlib library can be used instead? Don't think compression is as good a bzip2, but it's worth considering
  3. RE downloads, maybe you can use http://bintray.com/ or https://github.com/features/package-registry. One thing to note here is that companies have been know to use their local registry for gems, etc... and may not have internet access

Other thoughts:

Make the wkhtmltopdf-binary contain a script that one can run to install the right gem version
(of course this can cause problems for those that have loose gem dependencies)

@joergschiller
Copy link
Contributor Author

joergschiller commented May 27, 2019

Hi Skye! Thanks for your feedback.

  1. You can reduce the number of binaries bundled by creating platform-specific gems via RubyGems platform option

Sounds like a good and standard way to split this gem into smaller parts. I've not seen an example how to to this (but also haven't looked alot yet). Maybe you have more information about how to do this?

  1. With the widespread use of containers, it's likely that one will not have bzip2 installed. Maybe Ruby's zlib library can be used instead? Don't think compression is as good a bzip2, but it's worth considering

Sounds reasonable. I thought bzip2 is widespread but also needed to install it in some of the testing containers as well. I'll switch to gzip and deflate within ruby, the compression ratio is good as well:

35981168 wkhtmltopdf_centos_6_amd64
12702738 wkhtmltopdf_centos_6_amd64.bz2
14130125 wkhtmltopdf_centos_6_amd64.gz

@mackermedia
Copy link

mackermedia commented Jul 9, 2019

I wonder how many other non-centos users are 💔 by the delay in supporting the latest version vs. making sure everything is spick-and-span before releasing an update...?

@webaholik
Copy link

webaholik commented Jul 9, 2019

@joergschiller, @unixmonkey - I was summoned by a troll. What else is required on this one to get it to the finish line, anything I can help with?

@mackermedia
Copy link

<3 <3 <3 🤗 <3 <3 <3

Copy link

@rhuanbarreto rhuanbarreto left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR is fine! The owner should merge it!

@crstamps2
Copy link

Does this bundle all of the possible bin images into the gem, thereby making the gem very VERY large?

@crstamps2
Copy link

Ok yeah, built a branch using this PR and in the words of Trump, this gem is YUUUUGE!

I was able to install this fork to my machine (using the gem specific_install, unpack the gem, vendor it, but delete all of the binaries I didn't need, and push the whole app to the server. We build on macs, but deploy to CloudFoundry, and the ruby_buildpack is ubuntu based. Not ideal, but works for now.

@joergschiller
Copy link
Contributor Author

@joergschiller, @unixmonkey - I was summoned by a troll. What else is required on this one to get it to the finish line, anything I can help with?

From my point of view this is ready. It will produce a rather fat gem at the moment but one may keep the work to further break this down to another PR.

@joergschiller
Copy link
Contributor Author

Does this bundle all of the possible bin images into the gem, thereby making the gem very VERY large?

Yes definitely this gem is very large, all the binaries alone are slightly over 200 MB. @sshaw suggested to create platform-specific gems via RubyGems platform option (#49 (comment)).

For the time being I focused on getting the update ready. I was not able to look into splitting it up yet. Maybe you can help?

@sshaw
Copy link

sshaw commented Aug 12, 2019

@joergschiller what do you need me to do, write some code?

We need to generate and build 4 gemspecs. 3 of them will assign platform specific values:

  • x86_64-linux
  • amd64-linux
  • x86_64-darwin18 - though not sure if we can just say x86_64-darwin?

And assign the appropriate binary to files and executable. Probably best to copy them to a file name wkhtmltopdf and just assign to platform.

@k3rni
Copy link

k3rni commented Aug 16, 2019

Compressing with bzip/zlib is fine in transport, but we can probably do one better and ship upx-packed executables, which don't need an extra decompression step to work. This is the idea of #47 which seems stalled.

@sshaw
Copy link

sshaw commented Aug 24, 2019

So @k3rni, with your PR the size of your gem is 45MB with all executables? If so, we should go with that and move on with our lives. Well, assuming there's someone out there in this word that can merge! 🗺🤓

@joergschiller
Copy link
Contributor Author

Sorry for not coming back to this in the last weeks. We recently switched to grover (uses Puppeteer) because it supports newer features because of Chromium instead of Qt WebKit.

Feel free to merge this or use it otherwise or close this PR.

@unixmonkey unixmonkey merged commit 01a0733 into zakird:master Oct 31, 2019
@unixmonkey
Copy link
Collaborator

@joergschiller This is now released. Thank you for your help.

@pedrofurtado
Copy link
Contributor

@unixmonkey @joergschiller Is it possible to add CentOS 8?

@unixmonkey
Copy link
Collaborator

@pedrofurtado There is no official binary for CentOS 8 listed here:
https://wkhtmltopdf.org/downloads.html

Does the binary for CentOS 7 work on CentOS 8? If so, a small adjustment to bin/wkthmltopdf should help. Please open a separate issue or PR for this.

@pedrofurtado
Copy link
Contributor

pedrofurtado commented Nov 7, 2019

@breisig
Copy link

breisig commented Nov 14, 2019

Need one for CentOS 8 badly.

@barrywoolgar
Copy link

Need one for CentOS 8 badly.

@breisig They've said above that this could be an easy fix, but you'll need to help them out by testing the CentOS 7 binary on your system. If it works then you could even do the PR yourself by the looks of things.

@pedrofurtado There is no official binary for CentOS 8 listed here:
https://wkhtmltopdf.org/downloads.html

Does the binary for CentOS 7 work on CentOS 8? If so, a small adjustment to bin/wkthmltopdf should help. Please open a separate issue or PR for this.

tioteath pushed a commit to magnet-team/wkhtmltopdf-binary that referenced this pull request Jul 28, 2020
parallel588 pushed a commit to parallel588/wkhtmltopdf_binary_gem that referenced this pull request May 20, 2021
parallel588 pushed a commit to parallel588/wkhtmltopdf_binary_gem that referenced this pull request May 20, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.