Bilevel BW png inverted in PDF output #2059
Comments
Attach the input PNG please. |
interesting: |
The next Leptonica release will have a fix. We can also add a workaround in Tesseract
which will force transcoding for 1 bit per pixel PNG in pdfrenderer.cpp
… |
I observe the same color inverted problem in PDF output using the latest code. Version: |
Thought we fixed this already in Leptonica. Dan will take a look.
|
I don't remember what we did. I can't work on this for 2 weeks. Jeff, can you figure out what needs to be done to fix the problem? |
@jbreiden @DanBloomberg : issue is still valid for the latest code of tesseract and leptonica. |
https://github.com/DanBloomberg/leptonica/blob/master/src/pdfio2.c#L656
Reproduces in pure Leptonica, using converttopdf from leptonica-progs.
Leptonica sees this image is binary
(1 bit per sample) and decides to transcode. It calls pixRead()
then pixGenerateFlateData(). The input image
does have a colormap. Need to do a bit more tracing to see where it goes
wrong.
Colormap entries: 2
Colormap:
0: ( 0, 0, 0) #000000 gray(0)
1: (255,255,255) #FFFFFF gray(255)
|
the colormap is opposite to the standard photometry for 1 bpp. So it's likely that we're just ignoring the fact that there is a colormap, which would give us a video-inverted image. If so, the fix is to check for the colormap and remove it first. |
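To make the suspected failure mode concrete, here is a minimal sketch in plain C++ (no Leptonica calls). It assumes the Leptonica convention that a 1 bpp pix without a colormap has 1 = black and 0 = white; the `Rgb` type and all function names are hypothetical and only illustrate the idea.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for one colormap entry (not Leptonica's PIXCMAP).
struct Rgb { uint8_t r, g, b; };

// Gray level shown when the colormap is ignored and the pixel is read with
// the assumed 1 bpp convention: 1 = black, 0 = white.
uint8_t GrayIgnoringColormap(int bit) {
  assert(bit == 0 || bit == 1);
  return bit ? 0 : 255;
}

// Gray level shown when the colormap is honored (entries assumed to be gray).
uint8_t GrayWithColormap(int bit, const std::array<Rgb, 2>& cmap) {
  assert(bit == 0 || bit == 1);
  return cmap[bit].r;
}

// Removes a two-entry gray colormap: returns pixels in the 1 = black
// convention, flipping the indices when entry 0 is the darker one.
std::vector<int> RemoveColormap(const std::vector<int>& bits,
                                const std::array<Rgb, 2>& cmap) {
  const bool zero_is_dark = cmap[0].r < cmap[1].r;
  std::vector<int> out;
  out.reserve(bits.size());
  for (int bit : bits) out.push_back(zero_is_dark ? 1 - bit : bit);
  return out;
}
```

With the colormap from this issue (0 = black, 1 = white), ignoring the colormap shows every pixel inverted, while removing it first yields pixels that render the same either way.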
Or the colormap doesn't make it into the PDF.
… |
Looked at the code, both in pngio.c and pdfio2.c. Leptonica appears to be doing the right thing. When So much for png. For pdf generation, as Jeff mentioned, a 1 bpp image will be transcoded. The first step is reading the image. As described above, a b/w pix (with or without a colormap), after being written to png and read back to a pix, is unchanged. The pdf generator doesn't change the polarity either. If someone can find a 1 bpp (b/w) image, as a png file, with or without a colormap, that gets inverted as a pdf (e.g., using convertToPdf), please attach it to this issue. |
Perhaps the problem is in tesseract. Are the 1 bpp images being transcoded? |
@jbreiden: did you have a chance to have a look at this? |
Taking time now to look again. Let's use this image and trace.
https://user-images.githubusercontent.com/1501035/59963730-d03e0a00-94bc-11e9-804a-f7f20e17bb4e.png
pixGenerateCIData runs, which dispatches to pixGenerateFlateData. It gets
back this:
cid->type = L_FLATE_ENCODE.
cid->ncolors = 0
That causes us to write a PDF object containing this metadata, which I
think is the only possible choice.
/ColorSpace /DeviceGray
/BitsPerComponent 1
/Filter /FlateDecode
/DecodeParms << /Colors 1 /BitsPerComponent 1 >>
I think this isolates where we are going wrong. Either the pixels need to
be inverted, or we need to keep the colormap. Either way, I think the
answer is going to be in pixGenerateFlateData.
|
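For reference, a PDF reader maps image samples to component values by linear interpolation over a /Decode pair, and for /DeviceGray the default pair is [0 1] with 0.0 meaning black. The sketch below shows that mapping in plain C++; `DecodeSample` is a hypothetical name, not part of Tesseract or Leptonica.

```cpp
#include <cassert>

// Maps an n-bit image sample to a component value using a /Decode pair
// [dmin dmax]. In /DeviceGray, 0.0 is black and 1.0 is white.
double DecodeSample(unsigned sample, int bits, double dmin, double dmax) {
  assert(bits >= 1 && bits <= 16);
  const unsigned max_sample = (1u << bits) - 1;
  assert(sample <= max_sample);
  return dmin + sample * (dmax - dmin) / max_sample;
}
```

With the metadata quoted above (no /Decode entry, so [0 1] applies), a 1-bit sample of 1 comes out white. If the encoder wrote 1 for black, the page therefore appears inverted unless the pixels are flipped or the mapping is reversed.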
Looking even more carefully at code in pdfrenderer.cpp, I see that all PNG
files are routed through pixGenerateCIData().
https://github.com/tesseract-ocr/tesseract/blob/master/src/api/pdfrenderer.cpp#L679
if (pixGetInputFormat(pix) == IFF_PNG)
  sad = pixGenerateCIData(pix, L_FLATE_ENCODE, 0, 0, &cid);
That is the proximate cause of this bug. When I originally wrote this code,
my intention was to route as much PNG as possible
through l_generateCIDataForPdf, which is the most "hands off" code path and
handles this image just fine. This is what the code looked like last time I
touched it, on 2016-07-21.
if (pixGetSpp(pix) == 4 && format == IFF_PNG) {
  Pix *p1 = pixAlphaBlendUniform(pix, 0xffffff00);
  sad = pixGenerateCIData(p1, L_FLATE_ENCODE, 0, 0, &cid);
  pixDestroy(&p1);
} else {
  sad = l_generateCIDataForPdf(filename, pix, kJpegQuality, &cid);
}
My suggestion for solving this bug is:
(a) Review change history and find out why Tesseract started routing all
PNG to pixGenerateCIData
(b) Set everything (or as much as possible) back to l_generateCIDataForPdf
(c) Test with the various tricky flavors of PNG from bug reports over the
years
(d) Once all is well, lock it down with a unit test
|
Thank you for looking at this, Jeff. I haven't been able to prove that the change you found is responsible for the inversion problem. I did a little experimentation with that image. The test program is attached (I had to call it ".txt" to upload it). Calling pixGenerateCIData() on a 1 bpp pix with a colormap generates a proper pdf with the colormap embedded. Calling l_generateCIDataForPdf() on a png file of a 1 bpp image with a colormap causes the colormap to be removed, and then generates a proper pdf. |
Someone needs to substitute l_generateCIDataForPdf() for
pixGenerateCIData() to see if the problem is fixed.
I did. It works for the image in this bug. However, I did not test against
the images from other bugs, e.g.
#1914
#1361
[ I'm sure there were some more, not sure how to find them ]
Calling pixGenerateCIData() on a 1 bpp pix with a colormap generates a
proper pdf with the colormap embedded.
This contradicts my tests from yesterday. Send me the output PDF, please.
|
The attached pdf with a colormap is generated in lines 35-38 of pdftest_invert (attached above). |
@jbreiden: I am afraid the problem is somewhere else: regardless I used So I made a simple test: create a pdf from the cid with:
l_uint8 *test_data;
size_t test_nbytes;
cidConvertToPdfData(cid, "testing issue 2059", &test_data, &test_nbytes);
l_binaryWrite("test_i2059.pdf", "w", test_data, test_nbytes);
lept_free(test_data);  // allocated by Leptonica, so release it with lept_free() rather than delete
printf("test finished.\n");
placed after tesseract/src/api/pdfrenderer.cpp lines 678 to 683 in 048f729,
and the image in test_i2059.pdf is not inverted... |
@jbreiden: when I add to result pdf |
@zdenop is right. This "/Decode [1 0]" was the missing piece of the
puzzle. An immediate fix is to add that to pdfrenderer.cpp in Tesseract,
similar to how Leptonica does it.
https://github.com/tesseract-ocr/tesseract/blob/master/src/api/pdfrenderer.cpp#L724
https://github.com/DanBloomberg/leptonica/blob/master/src/pdfio2.c#L1953
However, I would still like to send most or all PNG files to
l_generateCIDataForPdf, to avoid transcoding. It would be super helpful to
have various flavors of PNG on hand to test, such as:
(1) 1 bit per pixel, not indexed
(2) 1 bit per pixel, indexed with 0 = black, 1 = white (from this bug)
(3) 1 bit per pixel, indexed with 1 = black, 0 = white
(4) 1 bit per pixel, indexed with 2 colors that are not black or white
(5) That fun image from
#369
|
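The 1-bit flavors (1) to (4) listed above can be told apart by their palette alone. The sketch below is one possible classification, assuming the PNG samples are passed through without transcoding and that a non-indexed 1-bit PNG is grayscale (where sample 0 is black, matching the /DeviceGray default). The enum, type and function names are hypothetical, not existing Tesseract or Leptonica code.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

struct Rgb { uint8_t r, g, b; };

enum class OneBitEncoding {
  kGrayDefault,   // /DeviceGray, default decode: 0 = black, 1 = white
  kGrayInverted,  // /DeviceGray with /Decode [1 0]
  kIndexed        // an /Indexed color space carrying the two palette colors
};

bool IsBlack(const Rgb& c) { return c.r == 0 && c.g == 0 && c.b == 0; }
bool IsWhite(const Rgb& c) { return c.r == 255 && c.g == 255 && c.b == 255; }

// An empty palette means the PNG is not indexed (flavor 1).
OneBitEncoding ChooseEncoding(const std::vector<Rgb>& palette) {
  assert(palette.empty() || palette.size() == 2);
  if (palette.empty()) return OneBitEncoding::kGrayDefault;           // (1)
  if (IsBlack(palette[0]) && IsWhite(palette[1]))
    return OneBitEncoding::kGrayDefault;                              // (2)
  if (IsWhite(palette[0]) && IsBlack(palette[1]))
    return OneBitEncoding::kGrayInverted;                             // (3)
  return OneBitEncoding::kIndexed;                                    // (4)
}
```

A unit test built on this split would need one image per branch, which matches the four flavors requested in the list.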
fixed with 4a37cde. |
I'm not sure what you're trying to do, but if you add dithering, the value
of every pixel depends on its neighbors, and if there is color, there are
in general no restrictions on the rgb values that the pixels can achieve.
The only way to represent such an image is in a full color space, because
colormaps with 8 bit pixels can only have 256 different values.
…On Sun, Oct 27, 2019 at 6:36 AM zdenop wrote:
(4) - is it possible to create such a png? I tried
convert 3_colors_no_bw.png -colors 2 +dither image.bmp
and IrfanView reports it has 2 colors (1 BitsPerPixel). When I tried
convert 3_colors_no_bw.png -colors 2 +dither image.png
IrfanView reports it has 16.7 million colors (24 BitsPerPixel)...
[image: 3_colors_no_bw]
<https://user-images.githubusercontent.com/574156/67635398-eb83b080-f8c6-11e9-8457-1f73c66f9503.png>
|
For (4), Jeff wants a png that is "1 bit per pixel, indexed with 2 colors that are not black or white". |
you can make it with leptonica, as I did in the test program I attached to
this bug.
Read the 1 bpp image. Make a colormap consisting of two colors with any
rgb values.
The first color is applied to the 0 pixels, the second to the 1 pixels.
Then add the
colormap to the pix and save as png.
If you want, I'll do this tomorrow with your cat and mouse story image
(btw, I don't entirely trust that cat ...)
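
The recipe above uses Leptonica. As a dependency-free alternative for producing a small test file of flavor (4), here is a minimal sketch that writes an indexed 1-bit PNG by hand, using a single stored (uncompressed) deflate block so no zlib is needed. This is a different technique from the Leptonica route described above, and all names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

using Bytes = std::vector<uint8_t>;
struct Rgb { uint8_t r, g, b; };

// CRC-32 as used by PNG chunks, computed bit by bit for brevity.
uint32_t Crc32(const Bytes& data) {
  uint32_t crc = 0xFFFFFFFFu;
  for (uint8_t byte : data) {
    crc ^= byte;
    for (int k = 0; k < 8; ++k)
      crc = (crc & 1u) ? (crc >> 1) ^ 0xEDB88320u : crc >> 1;
  }
  return crc ^ 0xFFFFFFFFu;
}

// Appends a 32-bit value in big-endian order.
void PutU32(Bytes& out, uint32_t v) {
  for (int shift = 24; shift >= 0; shift -= 8)
    out.push_back(static_cast<uint8_t>(v >> shift));
}

// Appends one chunk: length, type, payload, CRC over type + payload.
void PutChunk(Bytes& out, const std::string& type, const Bytes& payload) {
  Bytes body(type.begin(), type.end());
  body.insert(body.end(), payload.begin(), payload.end());
  PutU32(out, static_cast<uint32_t>(payload.size()));
  out.insert(out.end(), body.begin(), body.end());
  PutU32(out, Crc32(body));
}

// Wraps raw bytes in a zlib stream holding one stored (uncompressed) block.
Bytes ZlibStored(const Bytes& raw) {
  assert(raw.size() <= 0xFFFF);  // one stored block holds at most 65535 bytes
  const uint16_t len = static_cast<uint16_t>(raw.size());
  const uint16_t nlen = static_cast<uint16_t>(~len);
  Bytes z = {0x78, 0x01, 0x01};  // zlib header, then BFINAL=1 / BTYPE=00
  z.push_back(static_cast<uint8_t>(len & 0xFF));
  z.push_back(static_cast<uint8_t>(len >> 8));
  z.push_back(static_cast<uint8_t>(nlen & 0xFF));
  z.push_back(static_cast<uint8_t>(nlen >> 8));
  z.insert(z.end(), raw.begin(), raw.end());
  uint32_t a = 1, b = 0;  // Adler-32 over the raw bytes
  for (uint8_t byte : raw) { a = (a + byte) % 65521; b = (b + a) % 65521; }
  PutU32(z, (b << 16) | a);
  return z;
}

// bits: row-major 0/1 values; color0 applies to 0 pixels, color1 to 1 pixels.
Bytes MakeTwoColorPng(uint32_t w, uint32_t h, const std::vector<int>& bits,
                      Rgb color0, Rgb color1) {
  assert(bits.size() == static_cast<std::size_t>(w) * h);
  Bytes png = {0x89, 'P', 'N', 'G', 0x0D, 0x0A, 0x1A, 0x0A};
  Bytes ihdr;
  PutU32(ihdr, w);
  PutU32(ihdr, h);
  // bit depth 1, color type 3 (indexed), compression 0, filter 0, no interlace
  for (uint8_t v : {1, 3, 0, 0, 0}) ihdr.push_back(v);
  PutChunk(png, "IHDR", ihdr);
  PutChunk(png, "PLTE", {color0.r, color0.g, color0.b,
                         color1.r, color1.g, color1.b});
  Bytes raw;
  const uint32_t row_bytes = (w + 7) / 8;
  for (uint32_t y = 0; y < h; ++y) {
    raw.push_back(0);                     // filter type 0 (None)
    raw.insert(raw.end(), row_bytes, 0);  // packed row, leftmost pixel in MSB
    for (uint32_t x = 0; x < w; ++x)
      if (bits[y * w + x]) raw[raw.size() - row_bytes + x / 8] |= 0x80 >> (x % 8);
  }
  PutChunk(png, "IDAT", ZlibStored(raw));
  PutChunk(png, "IEND", {});
  return png;
}
```

Writing the returned bytes to a file in binary mode gives a PNG whose two palette entries are whatever colors were passed in. The single stored block limits this sketch to small images (at most 65535 bytes of row data).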
|
That would be great, Dan. BTW: the text is the beginning of the fairy tale CAT AND MOUSE IN PARTNERSHIP by the Brothers Grimm. |
Goal is to make a set of test images, and find out what types of PNG can
safely move from pixGenerateCIData to l_generateCIDataForPdf. If we are
super lucky, it will be everything.
PS. I feel like I am living in a fairy tale with today's power outage.
<https://www.sfchronicle.com/california-wildfires/article/PG-E-outages-loom-for-up-to-2-million-in-14564579.php>
|
Took your cat-and-mouse example image (antialiased, with 16 colors) and converted it to 1 bpp with colors similar to those in the original, using:
|
I've incorporated some tests into the leptonica test prog/pdfio1_reg.c, namely lines 221 - 272. It has a new function I wrote that converts an RGB image to a colormapped one, if the number of colors <= 256. |
@jbreiden : will you be able to create unittests for this? |
Ahoi ! |
@ShinjiLE : thanks for example. Your files are missing part for this puzzle. In this case there should be |
@DanBloomberg @jbreiden: what about creating a function in leptonica for this? It would take the cid as an argument and return a string with all the needed information. I feel like we will copy this part of leptonica to tesseract step by step, and I think it would be better to manage (possible) problems in one place (in leptonica). |
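A rough sketch of what such a helper might look like, limited to the grayscale Flate case discussed in this issue. `ImageInfo` is a hypothetical stand-in and not Leptonica's L_COMP_DATA, the function name is invented, and the sketch assumes that 1-bit samples follow the 1 = black convention and so need "/Decode [1 0]". The /DecodeParms entry is left out.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical stand-in for the fields of interest (not L_COMP_DATA).
struct ImageInfo {
  int width;
  int height;
  int bits_per_sample;
  int ncolors;  // colormap size, 0 when there is no colormap
};

// Builds the entries of a Flate-encoded grayscale image dictionary.
std::string GrayImageDictEntries(const ImageInfo& info) {
  // Colormapped images would need an /Indexed color space instead.
  assert(info.ncolors == 0);
  std::ostringstream out;
  out << "/Width " << info.width << "\n"
      << "/Height " << info.height << "\n"
      << "/ColorSpace /DeviceGray\n"
      << "/BitsPerComponent " << info.bits_per_sample << "\n";
  // Assumption: 1-bit samples are stored with 1 = black, so the default
  // /DeviceGray mapping (1 = white) has to be reversed.
  if (info.bits_per_sample == 1) out << "/Decode [1 0]\n";
  out << "/Filter /FlateDecode\n";
  return out.str();
}
```

Keeping this decision in a single function means a caller such as pdfrenderer.cpp would not have to repeat the photometry rule.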
I haven't looked carefully at the tesseract code, and do not know why there
is this apparent duplication.
Once the cid (L_COMP_DATA struct) is built, there is just one call to make
the pdf data:
cidConvertToPdfData()
This makes an lpd (L_PDF_DATA struct), and launches
l_generatePdf(&data, &size, lpd)
to make the pdf string. l_generatePdf() uses this lpd to sequentially
call:
generateFixedStringsPdf(lpd);
generateMediaboxPdf(lpd);
generatePageStringPdf(lpd);
generateContentStringPdf(lpd);
generatePreXStringsPdf(lpd);
generateColormapStringsPdf(lpd);
generateTrailerPdf(lpd);
generateOutputDataPdf(pdata, pnbytes, lpd);
I agree it would be nice if tesseract can use as many of these generators
as possible.
|
I am sick, but willing and (hopefully) able to spend time figuring this out
gracefully when I feel better. In meantime g4 Tiff is broken and that is a
very important format. Please narrow as per my previous comment. I can't do
it myself because I lost my GitHub 2-factor authentication.
|
understand. done. 2d6f38e |
There was a new issue #2771 reported which has a similar problem, but with TIFF instead of PNG. It claims that 4.1 worked correctly, so that might be different. |
Linux64 Slackware Tesseract 4.0.0
Tesseract PDF output from a BW (bilevel type) png is white on black.
The png was generated with IM or ghostscript.
Output from a BW png of grayscale type is OK.