latexdiff messes up spaces after periods #269

Closed
jameswhqi opened this issue May 27, 2022 · 1 comment

@jameswhqi

latexdiff doesn't seem to distinguish between a period followed by a space or line break (ending a sentence) and a period followed directly by other text (as in "i.e., something"). When a period of one kind in the old file is matched with a period of the other kind in the new file, the spacing of the old (deleted) material comes out wrong, I guess because latexdiff treats the whitespace change as an "insignificant difference".
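
The suspected mechanism can be illustrated with a toy sketch. This is not latexdiff's actual code (latexdiff is written in Perl); the `tokenize` function and its regular expression are assumptions made up for illustration. It only shows that once whitespace is dropped during tokenization, "i.e." and "i. e." become indistinguishable:

```python
import re

# Toy tokenizer (hypothetical, NOT latexdiff's implementation):
# words and single punctuation marks become tokens; whitespace is discarded.
TOKEN = re.compile(r"\w+|[^\w\s]")


def tokenize(text):
    """Return the list of word and punctuation tokens in text."""
    return TOKEN.findall(text)


abbrev = tokenize("i.e., something")
spaced = tokenize("i. e.\n, something")

# Both inputs yield the same token sequence, so a comparison at this level
# cannot tell which spacing the original text had.
print(abbrev == spaced)
```

Under this assumption, any output rebuilt from the tokens has to guess the spacing, which matches the symptom in the MWE below.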


MWE:

one.tex

i.e., something

two.tex

one. two.
three.

latexdiff one.tex two.tex

\DIFdelbegin \DIFdel{i. e.
, something }\DIFdelend \DIFaddbegin \DIFadd{one. two.
three. }\DIFaddend

Expected output:

\DIFdelbegin \DIFdel{i.e., something }\DIFdelend \DIFaddbegin \DIFadd{one. two.
three. }\DIFaddend

latexdiff two.tex one.tex

\DIFdelbegin \DIFdel{one.two.three. }\DIFdelend \DIFaddbegin \DIFadd{i.e., something }\DIFaddend

Expected output:

\DIFdelbegin \DIFdel{one. two.
three. }\DIFdelend \DIFaddbegin \DIFadd{i.e., something }\DIFaddend

latexdiff --version

This is LATEXDIFF 1.3.2 (Algorithm::Diff 1.15 so, Perl v5.34.0)
  (c) 2004-2021 F J Tilmann
@ftilmann
Owner

Your assumption about why this goes wrong is correct. It's a somewhat pathological case, but I appreciate that it's probably not extremely uncommon. You can force the correct behaviour with --config MINWORDSBLOCK=0, but this will very likely have undesirable effects in longer text. A proper fix is actually not trivial at all, so instead I have hardcoded some common abbreviations (i.e. -- e.g. -- z.B.; the last one occurs in German texts) to be treated atomically. It's kind of ugly but it works. The list is easily extensible in the source code, but currently not configurable.
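
To illustrate what "treated atomically" could mean, here is a minimal Python sketch. It is an assumption-laden toy, not the actual latexdiff change: the names `ABBREVIATIONS` and `tokenize` are hypothetical, and the list simply mirrors the three abbreviations named above.

```python
import re

# Hypothetical list mirroring the abbreviations named in the comment above.
ABBREVIATIONS = ["i.e.", "e.g.", "z.B."]

# Abbreviations are tried first (and only at the start of a word), so each
# one matches as a single token; otherwise fall back to words and single
# punctuation marks. Whitespace is discarded, as in a naive tokenizer.
TOKEN = re.compile(
    r"(?<!\w)(?:" + "|".join(re.escape(a) for a in ABBREVIATIONS) + r")"
    r"|\w+|[^\w\s]"
)


def tokenize(text):
    """Split text into tokens, keeping known abbreviations whole."""
    return TOKEN.findall(text)


print(tokenize("i.e., something"))   # abbreviation stays one token
print(tokenize("one. two.\nthree."))  # sentence periods stay separate tokens
```

With the abbreviation kept as one token, its internal periods can no longer be matched against sentence-ending periods in the other file, which is the effect the hardcoded list is meant to achieve.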
