Infinite recursion with some PDF inputs #57

Closed
jgclark opened this issue Aug 19, 2023 · 27 comments

@jgclark

jgclark commented Aug 19, 2023

Apologies if this is the wrong place to report this.

I have just found your project to replace pdfbook and pdfnup as I'm moving away from LaTeX to Typst.
I was delighted to find it was available via brew, and appeared to install v3.0.9 without issue on my macOS 13.4.1 machine.
However, man pdfbook etc. return "No manual entry for pdfbook". As the README here doesn't include any usage, I was reduced to reading the code to find the args. I'm not sure why this is missing.

psbook works as expected.
But psnup -2 <file> produces lots of errors:

Traceback (most recent call last):
  File "/usr/local/bin/psbook", line 8, in <module>
    sys.exit(psbook())
             ^^^^^^^^
  File "/usr/local/Cellar/psutils/3.0.9/libexec/lib/python3.11/site-packages/psutils/command/psbook.py", line 86, in psbook
    transform.transform_pages(
  File "/usr/local/Cellar/psutils/3.0.9/libexec/lib/python3.11/site-packages/psutils/transformers.py", line 132, in transform_pages
    transform_pages(pagerange, odd, even, reverse)
  File "/usr/local/Cellar/psutils/3.0.9/libexec/lib/python3.11/site-packages/psutils/transformers.py", line 127, in transform_pages
    self.finalize()
  File "/usr/local/Cellar/psutils/3.0.9/libexec/lib/python3.11/site-packages/psutils/transformers.py", line 482, in finalize
    self.writer.write(self.outfile)
  File "/usr/local/Cellar/psutils/3.0.9/libexec/lib/python3.11/site-packages/pypdf/_writer.py", line 1310, in write
    self.write_stream(stream)
  File "/usr/local/Cellar/psutils/3.0.9/libexec/lib/python3.11/site-packages/pypdf/_writer.py", line 1283, in write_stream
    object_positions = self._write_pdf_structure(stream)
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/Cellar/psutils/3.0.9/libexec/lib/python3.11/site-packages/pypdf/_writer.py", line 1325, in _write_pdf_structure
    object_positions.append(stream.tell())
                            ^^^^^^^^^^^^^
OSError: [Errno 29] Illegal seek
EOF marker not found
Traceback (most recent call last):
  File "/usr/local/bin/psnup", line 8, in <module>
    sys.exit(psnup())
             ^^^^^^^
  File "/usr/local/Cellar/psutils/3.0.9/libexec/lib/python3.11/site-packages/psutils/command/psnup.py", line 158, in psnup
    doc = document_reader(infile, file_type)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/Cellar/psutils/3.0.9/libexec/lib/python3.11/site-packages/psutils/readers.py", line 124, in document_reader
    return constructor(file)
           ^^^^^^^^^^^^^^^^^
  File "/usr/local/Cellar/psutils/3.0.9/libexec/lib/python3.11/site-packages/psutils/readers.py", line 25, in __init__
    super().__init__(stream, strict, password)
  File "/usr/local/Cellar/psutils/3.0.9/libexec/lib/python3.11/site-packages/pypdf/_reader.py", line 318, in __init__
    self.read(stream)
  File "/usr/local/Cellar/psutils/3.0.9/libexec/lib/python3.11/site-packages/pypdf/_reader.py", line 1537, in read
    self._find_eof_marker(stream)
  File "/usr/local/Cellar/psutils/3.0.9/libexec/lib/python3.11/site-packages/pypdf/_reader.py", line 1608, in _find_eof_marker
    line = read_previous_line(stream)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/Cellar/psutils/3.0.9/libexec/lib/python3.11/site-packages/pypdf/_utils.py", line 266, in read_previous_line
    raise PdfStreamError(STREAM_TRUNCATED_PREMATURELY)
pypdf.errors.PdfStreamError: Stream has ended unexpectedly

psbook and psnup are both in /usr/local/bin/ and are in the PATH.

@jgclark
Author

jgclark commented Aug 19, 2023

And I simply can't get pstops to operate like I think it should:

pstops "2:0(0,0),1U(1w,1h)" nup-out.pdf > book.pdf

This gives me pstops: cannot open input file 2:0(0,0),1U(1w,1h) error.
Sorry if this is just basic user error.

@rrthomas
Owner

I'm sorry you're having trouble with PSUtils. To take your points in order:

  1. man pages: as far as I know, they are installed properly when the package is installed via pip, so this sounds like a problem with the brew packaging, and you should report it to brew.
  2. I can reproduce this with a PDF file: it seems that pypdf isn't happy writing directly to standard output. I'll look into it. In the meantime, you can work around the problem by always specifying an output file.
  3. pstops now requires -S or --specs before the page specifications. I knew this was a backwards incompatibility; what I had forgotten is that old PSUtils (the C version) does not even allow an option before the specs argument, instead requiring it (it's a mandatory argument). I shall restore backwards compatibility.
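
The "Illegal seek" in point 2 is what a pipe returns when asked for its position; the first traceback shows pypdf calling `stream.tell()` while writing. One possible way around this, sketched below with the standard library only, is to let the writer fill an in-memory buffer and then copy the bytes to the real output. The names `write_via_buffer` and `offset_recording_writer` are hypothetical and used only for illustration; this is not necessarily how PSUtils itself handles it.

```python
import io
import os
from typing import BinaryIO, Callable


def write_via_buffer(write_document: Callable[[BinaryIO], None], out: BinaryIO) -> int:
    """Let the writer use a seekable in-memory buffer, then copy the bytes to `out`."""
    buffer = io.BytesIO()
    write_document(buffer)       # tell() and seek() both work on BytesIO
    data = buffer.getvalue()
    out.write(data)              # the real output only ever sees plain writes
    out.flush()
    return len(data)


def offset_recording_writer(stream: BinaryIO) -> None:
    """Hypothetical stand-in for a writer that records byte offsets via tell()."""
    stream.write(b"%PDF-1.7\n")
    offset = stream.tell()
    stream.write(f"% next object starts at byte {offset}\n".encode("ascii"))


if __name__ == "__main__":
    read_fd, write_fd = os.pipe()  # a pipe behaves like a non-seekable stdout
    with os.fdopen(write_fd, "wb") as pipe_out:
        written = write_via_buffer(offset_recording_writer, pipe_out)
    with os.fdopen(read_fd, "rb") as pipe_in:
        assert len(pipe_in.read()) == written
```

The trade-off is that the whole document is held in memory before anything is written, which is usually acceptable for files of this size.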

@rrthomas
Owner

By the way, a workaround for the missing man pages: the output of --help is much more detailed than it used to be, and the man pages are now largely automatically generated from this output.

@jgclark
Author

jgclark commented Aug 19, 2023

Thanks for the very quick response.

  1. Right: is there a separate brew maintainer for your package?
  2. I'd tried specifying an output file too. This gives a much longer error, culminating with RecursionError: maximum recursion depth exceeded.
  3. Thanks. Without psnup working I can't get to test this.

I'd tried psnup -h, just not psnup --help! That does indeed help.

@rrthomas
Owner

rrthomas commented Aug 19, 2023

1. Right: is there a separate brew maintainer for your package?

Yes, I only package PSUtils for PyPI.

2. I'd tried specifying an output file too. This gives a much longer error, culminating with `RecursionError: maximum recursion depth exceeded`.

I can't reproduce this, so sounds like it's file-specific; please can you supply a failing test case? Or failing that, some of the traceback; presumably it's obvious whether the recursion is in psutils or pypdf?

@rrthomas
Owner

I've fixed the two bugs you uncovered in git. If you can test from git (see README.md) I'd be most grateful. In any case, I'll make a new release soon.

@jgclark
Author

jgclark commented Aug 20, 2023

OK, I've tried this, and psnup is failing in a different way.

Steps:

  • Failed to build libpaper -- see separate issue on that project
  • In the source directory: python -m build. This reported successful build of 2 projects.
  • PYTHONPATH=. python -m psutils.command.psbook test.pdf testbook.pdf. As before, ran OK producing 12 pages.
  • PYTHONPATH=. python -m psutils.command.psnup -2 testbook.pdf. Failed with new error that recursed. That starts:
[1,2] [3,4] [5,6] [7,8] Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "/Users/jonathan/Downloads/PSUtils-test/psutils/psutils/command/psnup.py", line 312, in <module>
    psnup()
  File "/Users/jonathan/Downloads/PSUtils-test/psutils/psutils/command/psnup.py", line 306, in psnup
    transform.transform_pages(
  File "/Users/jonathan/Downloads/PSUtils-test/psutils/psutils/transformers.py", line 133, in transform_pages
    transform_pages(pagerange, odd, even, reverse)
  File "/Users/jonathan/Downloads/PSUtils-test/psutils/psutils/transformers.py", line 122, in transform_pages
    self.write_page(
  File "/Users/jonathan/Downloads/PSUtils-test/psutils/psutils/transformers.py", line 456, in write_page
    outpdf_page.merge_transformed_page(self.reader.pages[real_page], t)
  File "/usr/local/lib/python3.11/site-packages/pypdf/_page.py", line 1370, in merge_transformed_page
    self._merge_page(
  File "/usr/local/lib/python3.11/site-packages/pypdf/_page.py", line 1071, in _merge_page
    return self._merge_page_writer(
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/pypdf/_page.py", line 1224, in _merge_page_writer
    aa = a.clone(
         ^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/pypdf/generic/_data_structures.py", line 197, in clone
    d__._clone(self, pdf_dest, force_duplicate, ignore_fields)
  File "/usr/local/lib/python3.11/site-packages/pypdf/generic/_data_structures.py", line 297, in _clone
    v.clone(pdf_dest, force_duplicate, ignore_fields)
  File "/usr/local/lib/python3.11/site-packages/pypdf/generic/_data_structures.py", line 197, in clone
    d__._clone(self, pdf_dest, force_duplicate, ignore_fields)
  File "/usr/local/lib/python3.11/site-packages/pypdf/generic/_data_structures.py", line 297, in _clone
    v.clone(pdf_dest, force_duplicate, ignore_fields)
  File "/usr/local/lib/python3.11/site-packages/pypdf/generic/_data_structures.py", line 115, in clone
    arr.append(data.clone(pdf_dest, force_duplicate, ignore_fields))
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/pypdf/generic/_base.py", line 295, in clone
    obj.clone(pdf_dest, force_duplicate, ignore_fields)
  File "/usr/local/lib/python3.11/site-packages/pypdf/generic/_data_structures.py", line 197, in clone
    d__._clone(self, pdf_dest, force_duplicate, ignore_fields)
  File "/usr/local/lib/python3.11/site-packages/pypdf/generic/_data_structures.py", line 297, in _clone
    v.clone(pdf_dest, force_duplicate, ignore_fields)
  File "/usr/local/lib/python3.11/site-packages/pypdf/generic/_data_structures.py", line 115, in clone
    arr.append(data.clone(pdf_dest, force_duplicate, ignore_fields))
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/pypdf/generic/_data_structures.py", line 197, in clone
    d__._clone(self, pdf_dest, force_duplicate, ignore_fields)
  File "/usr/local/lib/python3.11/site-packages/pypdf/generic/_data_structures.py", line 297, in _clone
    v.clone(pdf_dest, force_duplicate, ignore_fields)
  File "/usr/local/lib/python3.11/site-packages/pypdf/generic/_data_structures.py", line 197, in clone
    d__._clone(self, pdf_dest, force_duplicate, ignore_fields)
  File "/usr/local/lib/python3.11/site-packages/pypdf/generic/_data_structures.py", line 297, in _clone
    v.clone(pdf_dest, force_duplicate, ignore_fields)
  File "/usr/local/lib/python3.11/site-packages/pypdf/generic/_data_structures.py", line 115, in clone
    arr.append(data.clone(pdf_dest, force_duplicate, ignore_fields))
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/pypdf/generic/_base.py", line 295, in clone
    obj.clone(pdf_dest, force_duplicate, ignore_fields)
  File "/usr/local/lib/python3.11/site-packages/pypdf/generic/_data_structures.py", line 197, in clone
    d__._clone(self, pdf_dest, force_duplicate, ignore_fields)
  File "/usr/local/lib/python3.11/site-packages/pypdf/generic/_data_structures.py", line 297, in _clone
    v.clone(pdf_dest, force_duplicate, ignore_fields)
  File "/usr/local/lib/python3.11/site-packages/pypdf/generic/_data_structures.py", line 115, in clone
    arr.append(data.clone(pdf_dest, force_duplicate, ignore_fields))
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/pypdf/generic/_data_structures.py", line 197, in clone
    d__._clone(self, pdf_dest, force_duplicate, ignore_fields)
  File "/usr/local/lib/python3.11/site-packages/pypdf/generic/_data_structures.py", line 297, in _clone
    v.clone(pdf_dest, force_duplicate, ignore_fields)
  File "/usr/local/lib/python3.11/site-packages/pypdf/generic/_data_structures.py", line 197, in clone
    d__._clone(self, pdf_dest, force_duplicate, ignore_fields)
  File "/usr/local/lib/python3.11/site-packages/pypdf/generic/_data_structures.py", line 297, in _clone
    v.clone(pdf_dest, force_duplicate, ignore_fields)
  File "/usr/local/lib/python3.11/site-packages/pypdf/generic/_data_structures.py", line 115, in clone
    arr.append(data.clone(pdf_dest, force_duplicate, ignore_fields))
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/pypdf/generic/_base.py", line 295, in clone
    obj.clone(pdf_dest, force_duplicate, ignore_fields)
  File "/usr/local/lib/python3.11/site-packages/pypdf/generic/_data_structures.py", line 197, in clone
    d__._clone(self, pdf_dest, force_duplicate, ignore_fields)
  File "/usr/local/lib/python3.11/site-packages/pypdf/generic/_data_structures.py", line 297, in _clone
    v.clone(pdf_dest, force_duplicate, ignore_fields)
  File "/usr/local/lib/python3.11/site-packages/pypdf/generic/_data_structures.py", line 115, in clone
    arr.append(data.clone(pdf_dest, force_duplicate, ignore_fields))

My test file, generated by Typst (an emerging replacement for LaTeX) is:
test.pdf

@rrthomas
Owner

Thanks for this. Re libpaper, you should be able to install that with brew, no need to install it from source.

I can reproduce the crash with your test file, many thanks. I'll look into it.

@rrthomas
Owner

The endless recursion is inside pypdf, so it looks like a pypdf bug; I'll see if I can work up a suitable issue report for that project.
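
For readers wondering how a `clone` call can recurse forever: the traceback pattern (dictionary clone, array clone, reference clone, repeat) is what one would expect when an object graph contains a cycle and the copier does not remember what it has already copied. The sketch below is a generic illustration with plain Python dicts and lists, not pypdf's code; the keys and the helper names (`naive_clone`, `memo_clone`, `build_cyclic_page`) are assumptions made up for the example.

```python
from typing import Any, Dict, Optional


def naive_clone(obj: Any) -> Any:
    """Recursive copy with no record of visited objects; loops on cyclic graphs."""
    if isinstance(obj, dict):
        return {key: naive_clone(value) for key, value in obj.items()}
    if isinstance(obj, list):
        return [naive_clone(value) for value in obj]
    return obj


def memo_clone(obj: Any, memo: Optional[Dict[int, Any]] = None) -> Any:
    """Recursive copy that registers each copy before descending into it."""
    if memo is None:
        memo = {}
    if id(obj) in memo:
        return memo[id(obj)]          # already copied: reuse, do not recurse
    if isinstance(obj, dict):
        copy: Any = {}
        memo[id(obj)] = copy          # register first, so cycles resolve to this copy
        for key, value in obj.items():
            copy[key] = memo_clone(value, memo)
        return copy
    if isinstance(obj, list):
        copy = []
        memo[id(obj)] = copy
        for value in obj:
            copy.append(memo_clone(value, memo))
        return copy
    return obj


def build_cyclic_page() -> Dict[str, Any]:
    """Hypothetical page whose annotation points back at the page itself."""
    page: Dict[str, Any] = {"/Type": "/Page", "/Annots": []}
    annotation = {"/Type": "/Annot", "/P": page}
    page["/Annots"].append(annotation)
    return page


if __name__ == "__main__":
    page = build_cyclic_page()
    try:
        naive_clone(page)
    except RecursionError as exc:
        print("naive clone failed:", exc)
    copied = memo_clone(page)
    print(copied["/Annots"][0]["/P"] is copied)
```

Whether the Typst-generated file triggers exactly this kind of cycle is not established in this thread; the sketch only shows the general failure mode.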

@rrthomas
Owner

rrthomas commented Aug 20, 2023

A workaround in cases like this: convert your PDF to PostScript with pdf2ps, then do the operations on the PostScript file, then convert the result back to PDF with ps2pdf. This certainly works fine in this case, and in general the PostScript code is much simpler and less likely to suffer this sort of problem. It's also entirely within the PSUtils project, so I can fix it if there are bugs!
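
A minimal sketch of that round trip as a script, assuming Ghostscript's `pdf2ps` (PDF to PostScript) and `ps2pdf` (PostScript to PDF) and PSUtils' `psnup` are all on the PATH. The function names and the intermediate file names (`input.ps`, `nup.ps`) are hypothetical; only the three commands themselves come from this thread.

```python
import subprocess
from pathlib import Path
from typing import List


def build_commands(pdf_in: str, pdf_out: str, workdir: str = ".") -> List[List[str]]:
    """Return the three commands of the PDF -> PS -> 2-up PS -> PDF round trip."""
    work = Path(workdir)
    ps_in = str(work / "input.ps")    # hypothetical intermediate file name
    ps_nup = str(work / "nup.ps")     # hypothetical intermediate file name
    return [
        ["pdf2ps", pdf_in, ps_in],          # PDF to PostScript
        ["psnup", "-2", ps_in, ps_nup],     # two pages per sheet
        ["ps2pdf", ps_nup, pdf_out],        # PostScript back to PDF
    ]


def run_commands(commands: List[List[str]]) -> None:
    """Run each command in order, stopping at the first failure."""
    for argv in commands:
        subprocess.run(argv, check=True)


if __name__ == "__main__":
    for argv in build_commands("testbook.pdf", "testnup.pdf"):
        print(" ".join(argv))
```

Calling `run_commands(build_commands("testbook.pdf", "testnup.pdf"))` would execute the steps; the `__main__` block only prints them so the script is safe to run without the tools installed.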

@jgclark
Author

jgclark commented Aug 20, 2023

[Apologies, I accidentally edited this comment instead of quote-replying. I have tried to restore it as it was.—@rrthomas]

Thanks for this. Re libpaper, you should be able to install that with brew, no need to install it from source.

Confirmed I can install that way. The README doesn't mention the brew option, AFAICS.

A workaround in cases like this: convert your PDF to PostScript with pdf2ps, then do the operations on the PostScript file, then convert the result back to PDF with ps2pdf. This certainly works fine in this case, and in general the PostScript code is much simpler and less likely to suffer this sort of problem. It's also entirely within the PSUtils project, so I can fix it if there are bugs!

Nice.
This works, with the following caveats:

  • all pdf2ps and ps2pdf operations take about 8 secs to run: is that to be expected, or is it a likely indicator of poor PDF generation? Also the file size goes from 73KB (PDF) to 9.5MB (PS) in the first step.
  • the psnup -2 testbook.ps testnup.ps stage is clipping the text from the previous stage. In more detail, the 'top edge' is clipped, and as far as I can see it's writing more to 'US letter' size than A4, even when I try it with psnup -2 -p a4 testbook.ps testnup.ps or psnup -2 -P a4 testbook.ps testnup.ps.

I confirm the testbook.ps appears to be A4 dimensions.

@rrthomas
Owner

rrthomas commented Aug 20, 2023

Confirmed I can install that way. The README doesn't mention the brew option, AFAICS.

Indeed, I can't control/track downstream packages.

all pdf2ps and ps2pdf operations take about 8 secs to run: is that to be expected, or is it a likely indicator of poor PDF generation? Also the file size goes from 73KB (PDF) to 9.5MB (PS) in the first step.

That's a hazard of conversion.

the psnup -2 testbook.ps testnup.ps stage is clipping the text from the previous stage.

You can try setting the page size explicitly: psutils tries to read the page size from the PostScript, but may not succeed.

@rrthomas
Owner

rrthomas commented Aug 20, 2023

Another tip: I accidentally discovered that I was using pdftops and pstopdf (sic), from poppler, not pdf2ps and ps2pdf from GhostScript. They seem to work faster, produce smaller files, and preserve the page size!

@jgclark
Author

jgclark commented Aug 21, 2023

You can try setting the page size explicitly: psutils tries to read the page size from the PostScript, but may not succeed.

I did, as I showed above: "the psnup -2 testbook.ps testnup.ps stage is clipping the text from the previous stage. In more detail, the 'top edge' is clipped, and as far as I can see it's writing more to 'US letter' size than A4, even when I try it with psnup -2 -p a4 testbook.ps testnup.ps or psnup -2 -P a4 testbook.ps testnup.ps."

Or is there a different way to "set page size explicitly"?

More to the point, psbook seems to be detecting and keeping A4 size, but psnup seems not to be, from the output of psbook.

I accidentally discovered that I was using pdftops and pstopdf (sic), from poppler, not pdf2ps and ps2pdf from GhostScript. They seem to work faster, produce smaller files, and preserve the page size!

Sorry, I can't tell which you find to be the faster: gs or poppler?

@rrthomas
Owner

You can try setting the page size explicitly: psutils tries to read the page size from the PostScript, but may not succeed.

I did, as I showed above

Sorry, I didn't read your message carefully enough. However, you might well need to set both input and output page size to get it to work.

I accidentally discovered that I was using pdftops and pstopdf (sic), from poppler, not pdf2ps and ps2pdf from GhostScript. They seem to work faster, produce smaller files, and preserve the page size!

Sorry, I can't tell which you find to be the faster: gs or poppler?

Poppler.

@jgclark
Author

jgclark commented Aug 21, 2023

Thanks.

However, you might well need to set both input and output page size to get [page size detection] to work.

psnup -2 -P a4 -p a4 testbook.ps testnup.ps also shows the same clipping, when viewed in gs.

Incidentally, psbook seems to be detecting and keeping A4 size, but psnup seems not to be ... (I've not got as far as pstops.)

@rrthomas
Owner

Incidentally, psbook seems to be detecting and keeping A4 size, but psnup seems not to be ... (I've not got as far as pstops.)

If you could make a separate bug report for this issue, that would be super!

@rrthomas rrthomas changed the title from "New 3.09 install has issues" to "Infinite recursion with some PDF inputs" Aug 22, 2023
@rrthomas
Owner

I have retitled this bug to track the pypdf bug.

@rrthomas
Owner

Happy to note that the upstream bug has now been fixed. I will make a new release of PSUtils as soon as they do!

@jgclark
Author

jgclark commented Sep 12, 2023

Thanks. Here are my test findings (still on macOS 13.5.2 and python 3.11.4):

  • Builds OK.
  • PYTHONPATH=. python -m psutils.command.psnup -v → psnup.py 3.3.0
  • PYTHONPATH=. python -m psutils.command.psbook -v → psbook.py 3.3.0
  • PYTHONPATH=. python -m psutils.command.psnup -v test.pdf testbook.pdf → 12 pages as expected OK.
  • PYTHONPATH=. python -m psutils.command.psnup -2 testbook.pdf testnup.pdf → "RecursionError: maximum recursion depth exceeded in comparison" again. The full log is mostly repeats of this:
  File "/usr/local/lib/python3.11/site-packages/pypdf/generic/_base.py", line 295, in clone
    obj.clone(pdf_dest, force_duplicate, ignore_fields)
  File "/usr/local/lib/python3.11/site-packages/pypdf/generic/_data_structures.py", line 197, in clone
    d__._clone(self, pdf_dest, force_duplicate, ignore_fields)
  File "/usr/local/lib/python3.11/site-packages/pypdf/generic/_data_structures.py", line 297, in _clone
    v.clone(pdf_dest, force_duplicate, ignore_fields)
  File "/usr/local/lib/python3.11/site-packages/pypdf/generic/_data_structures.py", line 115, in clone
    arr.append(data.clone(pdf_dest, force_duplicate, ignore_fields))
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/pypdf/generic/_data_structures.py", line 197, in clone
    d__._clone(self, pdf_dest, force_duplicate, ignore_fields)
  File "/usr/local/lib/python3.11/site-packages/pypdf/generic/_data_structures.py", line 297, in _clone
    v.clone(pdf_dest, force_duplicate, ignore_fields)
  File "/usr/local/lib/python3.11/site-packages/pypdf/generic/_data_structures.py", line 197, in clone
    d__._clone(self, pdf_dest, force_duplicate, ignore_fields)
  File "/usr/local/lib/python3.11/site-packages/pypdf/generic/_data_structures.py", line 297, in _clone
    v.clone(pdf_dest, force_duplicate, ignore_fields)
  File "/usr/local/lib/python3.11/site-packages/pypdf/generic/_data_structures.py", line 115, in clone
    arr.append(data.clone(pdf_dest, force_duplicate, ignore_fields))
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

@rrthomas
Owner

rrthomas commented Sep 12, 2023

Looks like you're not using pypdf 3.16. Also, you have a psnup command that simply prints the version of psnup where I'm expecting a psbook command. I confirm that it works fine for me with PSUtils 3.3.0 and the following commands, with your test.pdf as attached to this issue report:

$ PYTHONPATH=. python -m psutils.command.psbook test.pdf testbook.pdf
[*] [1] [2] [*] [10] [3] [4] [9] [8] [5] [6] [7] 
Wrote 12 pages
$ PYTHONPATH=. python -m psutils.command.psnup -2 testbook.pdf testnup.pdf
[1,2] [3,4] [5,6] [7,8] [9,10] [11,12] 
Wrote 6 pages

@jgclark
Author

jgclark commented Sep 12, 2023

There were no instructions about it in the README, so I thought the update would have been bundled/compiled in to your release.

When you install with pip, it will automatically install the required deps. If you run from a git checkout, it's up to you to install the deps, sorry! I don't cover this in the README because it's not PSUtils-specific, this is just how Python/pip works, and as with many other details, I don't have the time and space to repeat the documentation.

In the meantime I'd tried the PS route (which I think doesn't use that). PYTHONPATH=. python -m psutils.command.psnup -2 testbook.ps testnup.ps → wrote 6 pages. Shows same clipping as before.

Again, works fine for me. If you'd like to pursue this one, please comment on issue #58.

@jgclark
Author

jgclark commented Sep 12, 2023

pip install pypdf → "Requirement already satisfied: pypdf in /usr/local/lib/python3.11/site-packages (3.15.2)"

How can I force it to get 3.16?
(Apologies again for not knowing python infrastructure well.)

@rrthomas
Owner

pip install --upgrade pypdf.
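
To check which pypdf a given Python actually sees (useful when running from a git checkout, where pip has not resolved the dependencies for you), something like the following sketch can help. It uses only the standard library; the helper names are hypothetical, the 3.16 threshold is the version mentioned above, and the simple numeric comparison is an assumption that ignores pre-release suffixes (the third-party `packaging` library handles those properly).

```python
import re
from importlib import metadata
from typing import Optional, Tuple


def version_tuple(version: str) -> Tuple[int, ...]:
    """Turn '3.15.2' into (3, 15, 2), stopping at the first non-numeric piece."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def is_at_least(installed: str, required: str) -> bool:
    """Compare two dotted versions numerically, padding the shorter with zeros."""
    a, b = version_tuple(installed), version_tuple(required)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version("pypdf")
    if found is None:
        print("pypdf is not installed")
    elif is_at_least(found, "3.16"):
        print(f"pypdf {found} is new enough")
    else:
        print(f"pypdf {found} is too old; try: pip install --upgrade pypdf")
```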

@jgclark
Author

jgclark commented Sep 12, 2023

Phew. That's done the trick. The pdf route is now producing the file I was hoping for.
Thanks for persisting to get this sorted out.

Great! Happy to help, and a pypdf contributor did the crucial work here, along with you by reporting the bug in the first place.

@jgclark
Author

jgclark commented Sep 12, 2023

There were no instructions about it in the README, so I thought the update would have been bundled/compiled in to your release.

When you install with pip, it will automatically install the required deps. If you run from a git checkout, it's up to you to install the deps, sorry! I don't cover this in the README because it's not PSUtils-specific, this is just how Python/pip works, and as with many other details, I don't have the time and space to repeat the documentation.

Gotcha.
I've now tried pip install --upgrade pspdfutils. That's done the right thing, and I can see man pages as well.

@rrthomas
Owner

Yes, I was just about to point out that to test a release, best to let pip handle the whole thing!
