[RFC] using Weasyprint to generate pdf #254

Closed
legalsylvain opened this issue Nov 14, 2018 · 38 comments
Labels
stale PR/Issue without recent activity, it'll be soon closed automatically.

Comments

@legalsylvain
Contributor

Hello,

This issue is here to discuss the opportunity to integrate Weasyprint into Odoo. Feel free to ask questions.

Reference :

CC: @liZe, alias Guillaume Ayoub, CEO of Kozea, which developed the library.

@hbrunn
Member

hbrunn commented Nov 14, 2018

Yes, writing report_qweb_weasyprint_renderer has been on my list for a while already. You can just take #159 and replace chrome with weasyprint.

This will get us beautiful reports with full page backgrounds and proper page margin boxes, amongst others. What I think will be a little bit of a challenge is to add some compatibility layer so that we can actually replace wkhtmltopdf with it. But I think for starters the flag (or maybe a second selection field renderer so that modules can just add to that) as I did in the chrome version will be fine, then we can already use it and add some extra module coming with the proper templates to really replace wkhtmltopdf

@alexis-via
Contributor

I see in the documentation that weasyprint is able to produce PDFs with attachments, which is really cool (cf. https://weasyprint.readthedocs.io/en/latest/api.html)! But it doesn't seem to have an option to produce PDF/A (which would make it the perfect tool to generate compliant Factur-X invoices!)... I didn't see anything about it in the documentation. @liZe can you confirm that?

@yajo
Member

yajo commented Nov 15, 2018

The docs say:

It is based on various libraries but not on a full rendering engine like WebKit or Gecko.

IMHO it would be better then to try to use a full rendering engine, one that matches a good browser. That should lead to the fewest surprises. Odoo v12+ uses chrome/chromium to run JS tests. Maybe we can use it to render PDFs. Of course this is a general idea, and I have not hacked on it.

Also I'd appreciate it if you open the proposal for Odoo master, since this is a general problem. It can be backported then. They claim to be open to suggestions: odoo/odoo#21255 (comment)

@hbrunn
Member

hbrunn commented Nov 15, 2018

Check out the PR linked above; this is a working addon which uses chrome for the rendering.

The thing is when you read the mailinglists and bug trackers of chrome and firefox, most people don't seem very interested in paged media stuff within browsers. That makes sense to me, as a nice printout isn't exactly a killer feature for a browser. So I personally wouldn't hold my breath until any of the above implements CSS3 page margins and counters for example. Just going ahead and contributing it to one of the projects will be weeks or months of full time work I assume, and that doesn't include getting familiar with the project in the first place.

@yajo
Member

yajo commented Nov 15, 2018

Hmm I understand... Yes, that makes sense, thanks!

@rco-odoo

FYI we have considered using Weasyprint to render reports in Odoo 12.0.

The issue was that, in its current state, Weasyprint does not scale well: rendering very large documents simply blows up memory. That is because all the rendering is done in pure Python. Nice for readability, but not for performance...

@yajo
Member

yajo commented Nov 15, 2018

Have you considered a different engine then?

@liZe

liZe commented Nov 15, 2018

Hello everyone.

I'm the current lead developer of WeasyPrint, and of course I would be really happy to see Odoo use WeasyPrint to render PDF files.

As it has been said in previous comments, WeasyPrint is a rendering engine dedicated to print, thus supporting interesting CSS features that are not available in browsers like:

  • page margins,
  • cross references,
  • different page formats and layouts in the same document,
  • running headers,
  • printer's marks and bleeds,
  • advanced typography features…

These features are really important if you want to produce high quality documents, and not just "enhanced screenshots" from common browsers. I'm convinced that really supporting paged media can open great opportunities, not only to generate invoices and reports, but also badges, tickets, business cards, posters, etc.

Of course, WeasyPrint comes with its downsides too.

The first one is the lack of some CSS features (like calc() or display: grid for example) and the lack of JS support. Following the specs needs a lot of time, but we've got a pretty decent CSS support already!

The second one is the speed and memory consumption. As said before, WeasyPrint is slow compared to browsers and eats a lot of memory for very large documents. But contrary to what @rco-odoo said (and what most people believe, no offense 😉), the main reason for that is WeasyPrint's pretty simple (stupid?) algorithms, not Python. Of course, we'll never reach Chrome's or Firefox's performance using Python, but we can get large improvements just by avoiding useless code execution and freeing memory at the right moment. Speed and memory have largely been secondary concerns until now (compared to shiny new CSS features that everybody wants), so we can greatly improve that point with some work. And having a small amount of code (about 14k lines of Python) with a lot of tests is good news for that.

Now that 8-year-old WeasyPrint has reached its first stable release, we have started a lot of actions to understand the real needs of our users and spend more time on what really counts for professional use. We're contacting users to get testimonials and understand their problems, we've started a campaign with bounties and paid support, we're creating samples to help people to discover CSS features and see how useful they can be… So, we want to spend more time developing WeasyPrint, it looks like the perfect moment to improve it for Odoo users' needs.

But I think for starters the flag (or maybe a second selection field renderer so that modules can just add to that) as I did in the chrome version will be fine, then we can already use it and add some extra module coming with the proper templates to really replace wkhtmltopdf

I don't know how Odoo works (yet!), but allowing multiple PDF renderers and letting users choose the one they want seems to be a good idea.

Have you considered a different engine then?

Prince, for example, but it's not open source and quite expensive. Of course, I'd prefer spending money on developing WeasyPrint 😄, but it's great software to be honest.

I see in the documentation that weasyprint is able to produce PDF with attachment, which is really cool

It is!

But it doesn't seem to have an option to produce PDF/A (which would make it the perfect tool to generate compliant Factur-X invoices !)

Supporting PDF/A seems to be much easier than PDF/X, and of course Factur-X would be really useful. (See Kozea/WeasyPrint#630 and Akretion's factur-x.)

@yajo
Member

yajo commented Nov 15, 2018

the main reason for that is WeasyPrint's pretty simple (stupid?) algorithms, not Python

Well, I thought this at first, because otherwise we should refactor Odoo into C to get it fast, but that's not gonna be the case (I hope! 😆).

I'd love to be able to hack into the PDF renderer using just Python. Currently wkhtmltopdf is like a blackbox to me (and many Odoo devs I guess).

Maybe @rco-odoo can provide some samples of reports that worked badly with Weasyprint?

hbrunn added a commit to hbrunn/reporting-engine that referenced this issue Nov 17, 2018
@hbrunn
Member

hbrunn commented Nov 17, 2018

I pushed a working proof of concept to #256

But we're quite far away from being able to use this as a drop-in replacement for wkhtmltopdf:

hbrunn added a commit to hbrunn/reporting-engine that referenced this issue Nov 18, 2018
@liZe

liZe commented Nov 19, 2018

I pushed a working proof of concept to #256

That was fast!

box-decoration-break is not supported, but it should be quite easy to add. I'll create a PR this week for this feature, with clean commits and a lot of comments so that people interested in adding other features can learn the needed common steps.

Another problem related to this feature is split boxes not taking the whole page (see this test where purple borders should reach the bottom of each page).

  • this because I assume implementing position: running() and content: element() seem not really entirely specified

It's a working draft, but so are many features implemented in WeasyPrint. And named strings are implemented to solve some related cases. So… Why not!

(Why is that related to box-decoration-break?)

@hbrunn
Member

hbrunn commented Nov 19, 2018

when you have box-decoration-break, you can give your html (or body) element some margins, and place position: fixed element within these margins. So basically simulate margin boxes without margin boxes.
It's simply my assumption this is much easier to implement in weasyprint than content: element(). And useful anyways, so no work wasted if the above is implemented anyways.
Sure about the working draft, a lot of CSS3 is, but Issue1 in the specs you linked is what puts me off running ahead and implementing this immediately. For the implementation it probably makes quite a difference if we copy or move elements, so I personally would rather wait for this discussion to be settled before investing serious amounts of time there.

@liZe

liZe commented May 14, 2019

For the record: box-decoration-break support has been added (see Kozea/WeasyPrint#771).

@bhaveshselarka

@hbrunn @liZe @yajo What would be an alternative to wkhtmltopdf, as it has lots of issues? I am currently using Odoo version 11 community edition.

Are there any replacement modules available?

@hbrunn
Member

hbrunn commented May 31, 2019

look at #256 (that's also linked a bit up this thread) - contributions are welcome

@hbrunn
Member

hbrunn commented Jun 10, 2019

@liZe thanks for the heads up, this nearly works! I've problems with assigning a margin to the body element with cloning the box decoration (the top margin only exists on the first page there), but padding works.
Then I use position: fixed elements for footer and header, but here CSS counters and strings don't work as expected - I'll try to provide a patch for the library some time in the future.

@liZe

liZe commented Jun 10, 2019

I'll try to provide a patch for the library some time in the future.

You can post your HTML and CSS samples with a short explanation about what you expect, I can take a look.

@bhaveshselarka

Yes, @liZe, @hbrunn I have also faced the same issues, it's not taking properties from HTML to PDF.

@hbrunn
Member

hbrunn commented Jun 11, 2019

@liZe attached a minimal example. The spec about generated content only talks about margin boxes, but I think those would be super useful for position: fixed elements too:

<html>
    <head>
        <style type="text/css">
            @page {
                size: A4;
                margin: 0px;
            }
            .current_heading:after {
                content: string(title);
            }
            .current_page:after {
                content: counter(page);
            }
            .page_count:after {
                content: counter(pages);
            }
            .header {
                text-align: center;
                position: fixed;
                top: 0px;
                left: 0px;
                right: 0px;
                height: 1cm;
                padding: 0.5cm;
                background: blue;
            }
            .footer {
                text-align: center;
                position: fixed;
                bottom: 0px;
                left: 0px;
                right: 0px;
                height: 1cm;
                padding: 0.5cm;
                background: green;
            }
            html {
                margin: 0px;
                padding: 0px;
            }
            body {
                margin: 0px;
                padding: 2cm;
                box-decoration-break: clone;
            }
            h1 {
                string-set: title content();
                page-break-before: always;
            }
        </style>
    </head>
    <body>
        <div class="header">
            I'm a header: <span class="current_heading" />
        </div>
        <section>
            <h1>This should show up in the header on the first page</h1>
            hello<span style="page-break-after: always" />world
            <h1>This should show up in the header on the second page</h1>
        </section>
        <div class="footer">
            Page <span class="current_page"/> of <span class="page_count" />
        </div>
    </body>
</html>

hbrunn added a commit to hbrunn/reporting-engine that referenced this issue Sep 22, 2019
@github-actions

github-actions bot commented Mar 6, 2022

There hasn't been any activity on this issue in the past 6 months, so it has been marked as stale and it will be closed automatically if no further activity occurs in the next 30 days.
If you want this issue to never become stale, please ask a PSC member to apply the "no stale" label.

@github-actions github-actions bot added the stale PR/Issue without recent activity, it'll be soon closed automatically. label Mar 6, 2022
legalsylvain pushed a commit to legalsylvain/reporting-engine that referenced this issue Apr 25, 2023
@len-foss

After struggling with wkhtmltopdf to get both good header and footer for an entire day, I gave a spin to the 16.0-ADD-weasyprint-lru-cache-idiot branch to test Weasyprint.
To be frank, I'm amazed at how bad the results were (performance is also a problem, for that matter).
Headers and footers are so broken that it is preferable to not have them at all.
Margins and padding are completely off compared to default rendering.
I did extensive tests and in some cases the results were almost good, but in all cases at least one thing was unacceptable.

With some code tweaking, I also managed to use Chrome headless as a renderer (I could make a branch for others to test) but the results were entirely broken too (although general layout was OK...)

I also tested Prince, and to be fair although the results were somewhat better they were also not acceptable.
On the quotations, the footer always ended up on another page. The margins/padding are also very wide so that is probably the cause.

With respect to padding/margin sizes, wkhtmltopdf is clearly the odd one out, however the default styles have been written for it, so this would be an additional hurdle for a switch (unless someone already has decent tweaks?).

Other approaches I tested are not really suitable with respect to style.
For example, Pandoc somewhat mimics the placement of elements, but it entirely loses the document structure, so it wouldn't be possible to use, say, a LaTeX stylesheet to recover a good-looking output.

At this point I think it would make sense to generate headers/footers as separate pdfs, and stitch them together to the main content
(printed on a format of, say, A4 - (height(footer) + height(header))).
It wouldn't be terribly complicated, although it would not be very simple either since they have to be generated per language, and the footer should be with the page number as overlay.
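
A minimal sketch of the page-size arithmetic behind this idea, assuming A4 portrait (210 mm x 297 mm) and header/footer heights measured in millimetres. The function names and the example heights are hypothetical, and the stitching step itself (overlaying the separately rendered header and footer PDFs onto each page) is left out.

```python
A4_WIDTH_MM = 210.0
A4_HEIGHT_MM = 297.0


def body_page_size(header_mm, footer_mm):
    """Return (width, height) in mm left for the main content on an A4 page."""
    height = A4_HEIGHT_MM - (header_mm + footer_mm)
    if height <= 0:
        raise ValueError("header and footer leave no room for content")
    return A4_WIDTH_MM, height


def page_rule(header_mm, footer_mm):
    """Build a CSS @page rule for the reduced body format."""
    width, height = body_page_size(header_mm, footer_mm)
    return "@page { size: %gmm %gmm; margin: 0; }" % (width, height)


# Hypothetical measurements: a 35 mm header and a 20 mm footer.
print(page_rule(35, 20))
```

The heights would have to be measured per language, since translated headers and footers may wrap differently.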

If there's some work on another solution I'd be willing to test/help.

@liZe

liZe commented Jun 22, 2023

To be frank, I'm amazed at how bad the results were (performance is also a problem, for that matter).

That’s a proof of concept, not a branch ready to be merged 😄.

About performance, a large amount of time is spent downloading countless fonts that are not used or don’t exist anymore. These fonts are in the default stylesheet. They don’t take that much time using Chrome or wkhtmltopdf because they parallelize requests, and WeasyPrint doesn’t. There’s currently a cache for that (as far as I can remember), so that’s much faster after the first rendering, but with a clean stylesheet it’ll be much better.
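
To illustrate why such a cache helps, here is a minimal sketch that memoizes a resource fetcher so each URL is downloaded at most once per process. `make_cached_fetcher` and `fake_fetch` are hypothetical names: the wrapped function stands in for whatever callable actually downloads fonts and stylesheets, and this is not the caching code used by WeasyPrint or by the PoC branch.

```python
from functools import lru_cache


def make_cached_fetcher(fetch, maxsize=256):
    """Wrap fetch(url) so repeated URLs are served from memory."""

    @lru_cache(maxsize=maxsize)
    def cached(url):
        return fetch(url)

    return cached


calls = []


def fake_fetch(url):
    # Stand-in for a slow network download; records every real call.
    calls.append(url)
    return b"font-bytes-for-" + url.encode()


fetch = make_cached_fetcher(fake_fetch)
fetch("https://example.com/a.woff")
fetch("https://example.com/a.woff")  # served from the cache
fetch("https://example.com/b.woff")
print(len(calls))  # 2
```

Note that `lru_cache` does not memoize exceptions, so a URL that fails (a 404, for instance) would be retried on every call unless failures are cached separately.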

At this point I think it would make sense to generate headers/footers as separate pdfs, and stitch them together to the main content
(printed on a format say A4 - (height(footer) + height(header)).
It wouldn't be terribly complicated, although it would not be very simple either since they have to be generated per language, and the footer should be with the page number as overlay.

For example WeasyPrint can use running elements for headers and footers.

With respect to padding/margin sizes, wkhtmltopdf is clearly the odd one out, however the default styles have been written for it, so this would be an additional hurdle for a switch (unless someone already has decent tweaks?).

Many features (including page headers and footers) work using CSS with Web2Print solutions like WeasyPrint or Prince, while they were using a dedicated API for wkhtmltopdf, that’s why the rendering is broken depending on the content. I can definitely give hints about that if you want.

In my opinion, no solution will automatically give correct rendering without spending some time transforming the existing stylesheets. But I doubt that there’s something currently done by wkhtmltopdf and that can’t be done with other recent Web2Print tools.

@rvalyi
Member

rvalyi commented Jun 22, 2023

Hello, I think we should also be cautious if Odoo SA is going to increase the usage of JavaScript inside reports (weasyprint will never deal with JavaScript). Here is a commit they did in that direction yesterday for instance:
odoo/odoo#125969

Considering the trend is to unify the PDF reports and the portal presentation (like signing an order in the portal that should look like the PDF order) and considering Odoo SA is betting a lot on its owl stuff, one cannot guarantee JavaScript usage in reports will not increase. IMHO Weasyprint could be an alternative solution for some specific reports (say like py3o/libreoffice is) but hardly a general purpose replacement (I put more faith into the embedded browser solutions, be it inside a microservice to avoid cold starts).

@liZe

liZe commented Jun 23, 2023

I think we should also be cautious if Odoo SA is going to increase the usage of JavaScript inside reports (weasyprint will never deal with JavaScript).

Then it’s probably safer to use a tool based on a browser, even if it’s limited regarding pagination features.

@len-foss

Well, thank you very much @liZe for the detailed answer.

I concur that given the constraints and what Raphaël mentioned, it would make more sense that this could only be one rendering engine and not a full drop-in replacement for pdf generation.

Many features (including page headers and footers) work using CSS with Web2Print solutions like WeasyPrint or Prince, while they were using a dedicated API for wkhtmltopdf, that’s why the rendering is broken depending on the content. I can definitely give hints about that if you want.

I'm wondering how much time would have to be invested in tweaking it to get to a working solution; and in that case I'd be really afraid to spend days and never manage to get it working, or that it would be really broken the day the header is one line longer or something is two-pages long instead.

The PoC module contains some tweaking (with a similar one for header):

.footer {
    position: running(footer);
    width: 100%;
}

However, I didn't catch that it contains the comment <!-- this doesn't work yet /-->.
So I tweaked the CSS manually for a test, and the result ended up worse.

[screenshot: reportss]

Screenshot is left WeasyPrint, middle WeasyPrint with tweaks, right Prince with the tweaks (it has its fair share of issues).
First document is 2 pages with footer on the second page, middle is one page without footer and cropped header.

The tweaks are directly from https://github.com/legalsylvain/reporting-engine/blob/16.0-ADD-weasyprint-lru-cache-idiot/report_qweb_weasyprint_renderer/static/src/css/report_qweb_weasyprint_renderer_wkhtmltopdf_compat.css
(including the counter, for some reason).

If you have a suggestion for a better css that should work better I'm all for testing it.

@liZe

liZe commented Jun 23, 2023

Well, thank you very much @liZe for the detailed answer.

No problem!

If you have a suggestion for a better css that should work better I'm all for testing it.

Here are some quick tips:

  • You can define the height of the header by changing the @page { margin-top } property. It must be defined so that it can handle the whole header, @page { @top-center { vertical-align } } can be used to handle vertical alignment.
  • The @top-center box width is the page’s width without the left and right margins. If you want your header to take the whole page width, setting width: 21cm instead of 100% is what you want.
  • Running elements are only displayed on pages where they’ve already been seen. If you want your footer to be displayed on all pages, you can put it at the beginning of the document (generally just after the header). That’s probably why the footer is not visible in your example.
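
The three tips above can be sketched as a small stylesheet and template builder. This is only an illustration under assumptions: the function names, the `header`/`footer` class names and the example heights are hypothetical, the 21cm default assumes A4 portrait, and whether the output renders as intended depends on the renderer and its version.

```python
def print_stylesheet(header_cm, footer_cm, page_width_cm=21):
    """Return paged-media CSS with running header and footer elements."""
    return f"""
@page {{
    size: A4;
    margin-top: {header_cm}cm;      /* must be tall enough for the header */
    margin-bottom: {footer_cm}cm;   /* must be tall enough for the footer */
    @top-center {{ content: element(header); vertical-align: middle; }}
    @bottom-center {{ content: element(footer); vertical-align: middle; }}
}}
.header {{ position: running(header); width: {page_width_cm}cm; }}
.footer {{ position: running(footer); width: {page_width_cm}cm; }}
"""


def report_html(header, footer, body, css):
    """Place both running elements before the body so every page sees them."""
    parts = [
        "<html><head><style>", css, "</style></head><body>",
        '<div class="header">', header, "</div>",
        '<div class="footer">', footer, "</div>",
        body,
        "</body></html>",
    ]
    return "".join(parts)


css = print_stylesheet(3, 2)
document = report_html("My company", "Footer text", "<p>Report body</p>", css)
print(document)
```

Putting the footer right after the header, rather than at the end of the body, is what makes it available from the first page onwards.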

I don’t remember where the counter comes from, but we managed to remove it during our tests.

If you think that it could be useful to spend some time to get a more solid proposal, we can definitely work with @legalsylvain (who knows Odoo much better than me!) just as we already did to get this PoC. There’s no need for you to spend a lot of (often very frustrating 😄) time trying to fix everything in the stylesheet, I’ll probably be more efficient! It could be useful to have a list of documents with various content to test, so that we’re sure that the PoC fits your needs.

And, of course, no offense if some points are true blockers (such as JS support), there’s no need to work on a proposal that can’t be merged anyways!

@hbrunn
Member

hbrunn commented Jul 6, 2023

of course the whole point of this is to not need js-hacks (which is what happens in Odoo with wkhtmltopdf), and just do proper css3 paged media. it's all there (as in specification), we just need to implement it

@legalsylvain legalsylvain reopened this Jul 7, 2023
@pedrobaeza
Member

If I remember correctly, one of the problems of Weasyprint was the performance. Has that been addressed?

@legalsylvain
Contributor Author

If I remember correctly, one of the problems of Weasyprint was the performance. Has that been addressed?

Not exactly right. Well, weasyprint is slow as soon as the HTML code is totally ugly and badly designed. For example, when you generate an HTML report in a recent Odoo version, there are 50 calls to external resources (I mean on the internet). Of the 50, 25 return a 404 error...

After some patches in Odoo code to remove those useless requests, weasyprint is more or less as fast as wkhtmltopdf.
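
To find such requests before patching templates, a minimal sketch that lists the absolute http(s) URLs referenced by a rendered report. The class and function names and the sample markup are hypothetical; this is not part of Odoo or WeasyPrint.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class ExternalResourceFinder(HTMLParser):
    """Collect absolute http(s) URLs referenced by link, script and img tags."""

    URL_ATTRS = {"link": "href", "script": "src", "img": "src"}

    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        wanted = self.URL_ATTRS.get(tag)
        for name, value in attrs:
            if name == wanted and value and urlparse(value).scheme in ("http", "https"):
                self.urls.append(value)


def external_resources(html):
    finder = ExternalResourceFinder()
    finder.feed(html)
    return finder.urls


sample = """
<link rel="stylesheet" href="https://fonts.example.com/css?family=Foo"/>
<link rel="stylesheet" href="/web/assets/report.css"/>
<img src="http://example.com/logo.png">
<script src="/web/static/app.js"></script>
"""
print(external_resources(sample))
```

This only inspects the HTML itself. URLs pulled in from inside stylesheets (`@import`, `url(...)` in `@font-face`) and protocol-relative URLs are not detected and would need a separate pass.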

I did a PoC patching odoo master with @liZe during a day, 2 months ago. The PoC was "replace wkhtmltopdf by weasyprint" in the core module.

After a little work and some patches:

  • the result is quite OK (it needs extra work of course, to refine some things).
  • speed on small documents: more or less as fast as wkhtmltopdf.
  • deployment: easier (just a pip install).
  • implementation: simpler (with wkhtmltopdf the HTML is split into header / body / footer; that is not required here).
  • huge PDF WITH table: weasyprint fails, as wkhtmltopdf does. It is not possible to print a huge general ledger (I don't remember the size of the PDF, maybe 10000 pages...). According to @liZe, the reason is the high number of cells in the table: it is possible to print a huge document in weasyprint, but not a huge table. For me, it's an existing problem with wkhtmltopdf, and it's not a problem in itself. Printing a large 10,000-page book as a PDF doesn't make sense; the export type should be CSV / Excel...

At this step, there are two ways forward:
A) motivate Odoo SA to replace wkhtmltopdf with weasyprint. For me that's the solution, because there is no regression, and there are some improvements (at least: deployment, and a maintained project).
B) if Odoo SA still uses wkhtmltopdf (or replaces it with another tool), OCA can maintain report_weasyprint in reporting-engine (but it's less interesting and it requires tedious work).

The ball is in @bouvyd's court, whom I contacted a few months ago.
If people are interested in the subject, there could be a workshop at Odoo Experiences with the Odoo development team.

@cmal-odoo

Hello everyone, I'm an intern working for Odoo Buffalo. I made a comment earlier but deleted as I felt I was still a little under-informed on this topic.

I've been tasked with replacing wkhtmltopdf with WeasyPrint as an internship project. I don't think the organizers initially realized the scope of the task, but told me I was free to continue working on it if I so desired. I'm still a novice dev (one month experience working at Odoo and still working on my CS degree) and it seems like a pretty daunting task, but from what I've gathered this could be a really valuable improvement to Odoo's software.

Suffice to say, I have the opportunity to devote 100% of my time at work to develop this, every day until my internship ends mid-August, and potentially beyond if I end up working for the company after graduation. I just started and right now I'm still working on fully understanding how wkhtmltopdf is integrated into Odoo and studying the proof-of-concepts already developed.

Do you folks think it would be a bad idea for an inexperienced dev to take this on, or should I go for it? I'd appreciate any guidance on how I can best spend my time and contribute to this development. I don't want to waste time trying to write code other people have already written. Thanks.

@len-foss

len-foss commented Jul 7, 2023

I've been able to run Sylvain's branch with barely any modification:
16.0...legalsylvain:reporting-engine:16.0-ADD-weasyprint-lru-cache-idiot

Basically all the basic integration work is done in it, so if you're able to understand that module code then you're good to go.
There's a line that is obviously wrong in it (context["qweb_pdf_engine"] = "weasyprint") since it means you actually can't switch between wk and weasy.
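
A minimal sketch of one possible fix, assuming the engine is read from the context key `qweb_pdf_engine` as in the branch. The helper name and the `"wkhtmltopdf"` default value are hypothetical; the point is only that an explicitly chosen engine should not be overwritten.

```python
def with_pdf_engine(context, default="wkhtmltopdf"):
    """Return a copy of context with an engine set only if none was chosen."""
    new_context = dict(context)
    # setdefault keeps an engine the caller already picked.
    new_context.setdefault("qweb_pdf_engine", default)
    return new_context


print(with_pdf_engine({}))
print(with_pdf_engine({"qweb_pdf_engine": "weasyprint"}))
```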

In order to truly use it as a replacement the daunting task is to adapt the CSS and templates to weasyprint and optimize everything, as noted by Sylvain above. It was also noted earlier that running elements should be moved in the template so that the footer will appear on the first page for instance.

@Garcicasti

Hey @cmal-odoo

Remember we were all novice once. I think you have a great opportunity to learn and to make an excellent contribution to Odoo. This is not a task which requires you to know everything about Odoo to begin with. Wkhtmltopdf is an obsolete library which requires replacing and, to my understanding, it sounds like Weasyprint (even though it has some limitations of course) is a pretty good candidate.

Just go for it 💪 It won't be easy, but the worst thing that can happen is that you are not able to finish all the tweaking required to make all PDF reports work well with the new library. Or that you come to the conclusion that there is a better option.

Study the proof-of-concepts, lean on community and internal Odoo people to help you, and you'll make it.

Courage 😉

@gdgellatly
Contributor

If I remember correctly, one of the problems of Weasyprint was the performance. Has that been addressed?

@pedrobaeza On complex pages performance is ridiculously bad, i.e. compare www.odoo.com on wk vs wp. On more standard docs it is still somewhat slower but perhaps acceptable if no other alternative (There are some benchmarks Odoo did somewhere in issues for both these cases). I think however we do need to do benchmarks with things like 100 page accounting reports for both performance and memory use compared to wk.
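
A minimal sketch of such a benchmark harness, using only the standard library. `benchmark` and `fake_render` are hypothetical names; a real comparison would pass a callable that renders the report with each engine.

```python
import time
import tracemalloc


def benchmark(render, html):
    """Return (seconds, peak_bytes) for one call of render(html)."""
    tracemalloc.start()
    started = time.perf_counter()
    try:
        render(html)
    finally:
        elapsed = time.perf_counter() - started
        _, peak = tracemalloc.get_traced_memory()
        tracemalloc.stop()
    return elapsed, peak


def fake_render(html):
    # Stand-in for a real renderer call.
    return html.upper()


seconds, peak_bytes = benchmark(fake_render, "<p>line</p>" * 1000)
print(f"{seconds:.4f}s, peak {peak_bytes} bytes")
```

`tracemalloc` only sees memory allocated by the Python interpreter. Memory used by native libraries or by an external process such as the wkhtmltopdf binary is not counted, so a fair memory comparison would also need process-level measurements.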

@rvalyi
Member

rvalyi commented Jul 7, 2023

Hello. While I think it's okay to assume very large ledgers should not be printed as PDF, I think @gdgellatly is right: common 100-page (even 50 if you like) accounting reports should be available, otherwise it will just make the workflow more painful even in simple cases. You'll also find many industries printing many pages of large pickings/manufacturing orders.

@cmal-odoo I suggest you double check with the Belgian R&D team whether they are aligned with the no-JavaScript report vision and whether it's okay considering the future OWL ubiquity or even the need to render some PDFs as HTML pages in the portal (like signing an order in the portal). In particular, I suggest you check with Antony Lesuisse or Gery Debongnie.

Cause for a general purpose report engine there are other options like embedded browsers such as https://pptr.dev/ or Firefox equivalent that can still be made fast if deployed as a micro-service (avoids cold starts, it's already what some do with py3o Libreoffice reports long ago BTW https://github.com/OCA/reporting-engine/tree/14.0/report_py3o_fusion_server ).
Also, Puppeteer does have options to control the page presentation. Native basics: https://pptr.dev/api/puppeteer.pdfoptions https://pptr.dev/api/puppeteer.pdfmargin or more advanced via extensions https://github.com/PejmanNik/puppeteer-report#readme

Again, I think Weasyprint could certainly be a very good report engine option, I just question if it can become the default engine.

Thank you for addressing this important issue.

@legalsylvain
Contributor Author

legalsylvain commented Jul 7, 2023

@pedrobaeza On complex pages performance is ridiculously bad, i.e. compare www.odoo.com on wk vs wp.

Indeed, weasyprint has bad performance on HTML that is not designed to be printed, and odoo.com is not printable. If you go to odoo.com, it will load 29 resources. The total size is 10 MB (without images). The size of web.assets_frontend.min.css is 6.4 MB. That is just very bad for a quite simple page.

If you open odoo.com with google chrome, and click print, then export in pdf, you'll see :

  1. it is quite slow (several seconds for a 7-page PDF)
  2. the rendered pdf is totally ugly. See : Open Source ERP and CRM _ Odoo.pdf

TLDR : As long as you supply poorly generated html code that's too big because of useless data, and not designed for printing, wkhtmltopdf won't do the trick. You have to look at the problem the other way round: think html code for print.
Extra PoV : the real challenge of our time is not to do better, with more resources, but to do better, with fewer resources. We need to turn to more sober technologies, and eco-design our websites.

@rvalyi
Member

rvalyi commented Jul 8, 2023

Hello @legalsylvain I somewhat agree with that. But liking it or not, who decides the roadmap of Odoo so far is Odoo SA, not us. that's why I asked to check if Odoo R&D is aligned with this. Cause if the Odoo roadmap is "wow effects" all over the place to have the same pdf as your portal OWL based page Weasyprint is not going to cut it. That's why I say no doubt we can use Weasyprint as a nice alternative engine, but as for being the default engine, it will really be up to Odoo SA top R&D decisions.

Finally, about sobriety, I don't buy that one, cause browser CSS engines will always be orders of magnitude more optimized to render HTML than Python emulated CSS. The amount of engineering put into optimizing these engines with native efficient languages with parallelism is simply out of reach; think of how Google invested all its rivalry with Microsoft into exactly this...

@github-actions github-actions bot removed the stale PR/Issue without recent activity, it'll be soon closed automatically. label Jul 9, 2023
@yajo
Member

yajo commented Jul 10, 2023

You can also use LibreOffice as an alternative:

➤ time libreoffice --headless --convert-to pdf https://www.odoo.com/
convert  -> /var/home/yajo/Descargas/.pdf using filter : writer_web_pdf_Export

________________________________________________________
Executed in   15.14 secs    fish           external
   usr time    3.33 secs    0.00 micros    3.33 secs
   sys time    0.29 secs  606.00 micros    0.29 secs

It's slower and the PDF is even uglier! 🚀 😆

If you're rendering a local html file it's faster. I downloaded that website with assets and look at it:

➤ time libreoffice --headless --convert-to pdf ./Open\ Source\ ERP\ and\ CRM\ Odoo.html 
convert /var/home/yajo/Descargas/Open Source ERP and CRM Odoo.html -> /var/home/yajo/Descargas/Open Source ERP and CRM Odoo.pdf using filter : writer_web_pdf_Export

________________________________________________________
Executed in  997.29 millis    fish           external
   usr time  830.79 millis    0.00 micros  830.79 millis
   sys time  162.13 millis  679.00 micros  161.45 millis

Open Source ERP and CRM Odoo.pdf

I don't think it's better than weasyprint, but it's worth noting another working and maintained alternative.

Dumb question: why doesn't Odoo fork wkhtmltopdf and keep their fork up to date? After all, the main problem with it is that it has accumulated a big collection of CVEs through the lack of maintenance of its qt-webkit engine. It just needs security maintenance, not big new fancy features.

Probable answer: in Odoo there are many devs, but not much qt, c and webkit knowledge. If that's the answer, then Weasyprint does seem like a nice option. It is python, so it's much easier for any Odoo dev to diagnose and fix any bottlenecks, as the toolkit to do that is the same. Both Odoo and Weasyprint could benefit each other by working together IMHO.

Big question: are you guys going to make sure the reports look the same? If they don't, then I can see thousands of tickets from each customer saying "my invoices look different, what happened?" And that's gonna be PITA.


github-actions bot commented Jan 7, 2024

There hasn't been any activity on this issue in the past 6 months, so it has been marked as stale and it will be closed automatically if no further activity occurs in the next 30 days.
If you want this issue to never become stale, please ask a PSC member to apply the "no stale" label.

@github-actions github-actions bot added the stale PR/Issue without recent activity, it'll be soon closed automatically. label Jan 7, 2024