Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Stray line break #1051

Closed
drboone opened this issue Feb 4, 2020 · 21 comments
Closed

Stray line break #1051

drboone opened this issue Feb 4, 2020 · 21 comments
Labels
bug Existing features not working as expected
Milestone

Comments

@drboone
Copy link

drboone commented Feb 4, 2020

In the attached PDF, weasyprint has inserted a stray line break. Page one, column two, first full paragraph, between "through" and "remittances". I don't think there's anything in the source HTML that should cause this. Invocation was as simple as:

weasyprint -q review_54930.html review_54930.pdf

There are a few errors on each run of weasyprint for any of our inputs. They don't seem to be critical.

/opt/local/lib/python3.7/site-packages/weasyprint/fonts.py:230: UserWarning: FontConfig: No fonts configured. Expect ugly output.
  'FontConfig: No fonts configured. '
/opt/local/lib/python3.7/site-packages/weasyprint/__init__.py:120: FutureWarning: The behavior of this method will change in future versions.  Use specific 'len(elem)' or 'elem is not None' test instead.
  assert result

I will attach the input HTML and CSS as well as the PDF. Github is reliably crashing my chromium today, so it may take several tries.

review_54930_html.txt

@drboone
Copy link
Author

drboone commented Feb 4, 2020

review_54930.pdf

@drboone
Copy link
Author

drboone commented Feb 4, 2020

reviewpdfcss.txt

@liZe
Copy link
Member

liZe commented Feb 11, 2020

Thank you for this bug report!

I can reproduce the error even if I don’t get the same errors as you. I don’t know what’s happening 😢.

@liZe liZe added the feature New feature that should be supported label Feb 11, 2020
@patbakdev
Copy link

I am also encountering this issue. I tried changing/disabling the hyphenation and justification in the css but it made no difference. I downgraded all the way back to version 43 and still had the issue.

Not sure how this project does things, but shouldn't this be a bug and not a feature (label)

@liZe
Copy link
Member

liZe commented May 4, 2020

Not sure how this project does things, but shouldn't this be a bug and not a feature (label)

Of course.

@liZe liZe added bug Existing features not working as expected and removed feature New feature that should be supported labels May 4, 2020
@patbakdev
Copy link

Thanks. I managed to work-around the issue for now by shortening the sentence (turned an EM DASH into en EN DASH) and turning off hyphens. If I had to guess it may have something to do with the calculating the length of the line (accounting for margins, justification, and hyphan/breaks., etc)

@quis
Copy link

quis commented May 22, 2020

Hey, think we’re seeing the same problem with the heading (Someone needs to confirm your identity) in this letter:

Expected (Chrome print preview) Actual (Weasyprint)
image image

It only happens with that particular string of text. Adding extra characters (eg identity -> identities) renders as expected.

Here’s the HTML used for the above examples: https://gist.github.com/quis/289765c600ea9bd64d85fd41c54fd9dd

Here are the dependencies we’re using (including WeasyPrint 51):
https://github.com/alphagov/notifications-template-preview/blob/c9c6271aa09fc6b2d271604d66939f72e76250ad/requirements.txt

@Tontyna
Copy link
Contributor

Tontyna commented May 22, 2020

WOW, that's a strange one. Fine-tuning of available width, font, spacing and white-space required:

<style>
  .separate-spacing {
    word-spacing: -0.04em;
    letter-spacing: 0.01em;
  }
  .content {
    font-family: 'Nimbus Sans L';
    font-size: 20pt;
    font-weight: bold;
    width: 137.5mm;
    margin: 1em 0;
    background-color: #eee;
  }
</style>
<div class="separate-spacing">
  <div class="content">
      Someone needs to confirm your identity
  </div>
  <div class="content">
      Someone needs to confirm your identity</div>
</div>

The first <div> gets a strange line-break, the second <div> (mark the position of its </div>) just fits:

stray1

When I change the css and apply the *-spacing properties to the .content selector, like this:

<style>
  .content {
    word-spacing: -0.04em;
    letter-spacing: 0.01em;
    font-family: 'Nimbus Sans L';
    font-size: 20pt;
    font-weight: bold;
    width: 137.5mm;
    margin: 1em 0;
    background-color: #eee;
  }
</style>

... the first <div> recovers:

stray2

@liZe
Copy link
Member

liZe commented May 23, 2020

Well… There’s no way I can reproduce this error anymore. I’ve tried your 3 samples, I’ve tried to finely tune @Tontyna’s example, it just works for me.

I am also encountering this issue. I tried changing/disabling the hyphenation and justification in the css but it made no difference. I downgraded all the way back to version 43 and still had the issue.

Having more than one person reporting / reproducing this old issue only since February makes me think of a bug in a third-party library. It’s of course hard to find which one… Pango and Cairo latest versions are pretty old, it may be another one.

@liZe
Copy link
Member

liZe commented May 23, 2020

Well… There’s no way I can reproduce this error anymore. I’ve tried your 3 samples, I’ve tried to finely tune @Tontyna’s example, it just works for me.

Oh, I’ve finally reproduced the first problem with Noto installed. Time to debug :p.

@liZe
Copy link
Member

liZe commented May 23, 2020

The problem comes from the fact that a Pango layout created with a max width may generate lines longer than this max width, even if these lines can be split.

Why?

@liZe
Copy link
Member

liZe commented May 23, 2020

Why?

Because in recent versions of Pango, the width of lines in an approximation, you can’t rely on it. The "bug" is not there in 1.42.x.

@liZe
Copy link
Member

liZe commented May 23, 2020

There’s no way I can find to know whether a layout line is longer than the layout width or not. pango_layout_line_get_extents is just different from the line width internally used to split lines.

The only solution we have left is to split the layout again with a 0 width. If the line is smaller, it means that the original line actually fits (even if its width is wider than max_width) and that we should keep it instead of keeping only its first word.

There’s of course no way to test this reliably.

As soon as we get rid of Cairo, next step is Pango.

PS: This rounding problem is bad for us, but it’s even worse for a lot of people: https://gitlab.gnome.org/GNOME/pango/-/issues/404.

@liZe liZe closed this as completed in bea6cef May 23, 2020
@liZe
Copy link
Member

liZe commented May 23, 2020

As soon as we get rid of Cairo, next step is Pango.

@Tontyna Yes, it means that I’ll rewrite text.py. 😉

@liZe liZe added this to the 52 milestone May 23, 2020
@drboone
Copy link
Author

drboone commented May 23, 2020

In "for what it's worth", the system where I'm seeing this currently has cairo 1.16 and pango 1.44.7. I started seeing this in early January, iirc.

@liZe
Copy link
Member

liZe commented May 23, 2020

In "for what it's worth", the system where I'm seeing this currently has cairo 1.16 and pango 1.44.7. I started seeing this in early January, iirc.

Thank you. The problem appeared in Pango 1.44.x, released in late July 2019, but often included in distributions much later as it introduced a lot of changes.

@drboone
Copy link
Author

drboone commented May 23, 2020

You know, this triggered a faint memory. I think a lot of projects had problems with pango when it switched from returning floats to ints. Quick googling suggests there might be some useful bug reports out there.

@Tontyna
Copy link
Contributor

Tontyna commented May 24, 2020

Nevertheless I don't expect differences when the closing </div> is on a new line.
Nevertheless I don't expect differences when the word-spacing and the letter-spacing isn't inherited.

@Tontyna
Copy link
Contributor

Tontyna commented May 24, 2020

Not to mention the strange position where WeasyPrint breaks the layout line...

@liZe
Copy link
Member

liZe commented May 24, 2020

Not to mention the strange position where WeasyPrint breaks the layout line...

Other browsers break the line here too, I think Pango’s right. Pango knows the Unicode spec better than we do 😉.

Nevertheless I don't expect differences when the closing </div> is on a new line.

The current algorithm takes care of this trailing space, but the wrong assumption fixed in bea6cef broke it. There shouldn’t be a difference now.

Nevertheless I don't expect differences when the word-spacing and the letter-spacing isn't inherited.

It’s for sure a bug in WP, and it’s not fixed. These properties are applied really naively, just to work in many common cases, but it shouldn’t be hard to find many bugs because of them.

@Tontyna
Copy link
Contributor

Tontyna commented May 24, 2020

Only now had a look at bea6cef and the GNOME Pango issue, OMG! Cant believe it!

Good to know about you never joking

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Existing features not working as expected
Projects
None yet
Development

No branches or pull requests

5 participants