Bilevel BW png inverted in PDF output #2059

pit65 · 2018-11-16T16:43:29Z

Linux64 Slackware Tesseract 4.0.0

Tesseract PDF output from a BW (bilevel type) PNG is white on black.

The PNG is generated with IM (ImageMagick) or Ghostscript.
Output from a BW image of grayscale type is OK.

jbreiden · 2018-11-16T17:37:27Z

Attach the input PNG please.

pit65 · 2018-11-16T18:19:50Z

Here it is, from Ghostscript:

gs -sDEVICE=pngmono -dDOINTERPOLATE -sOutputFile=%03d.png -dSAFER -dBATCH -dNOPAUSE -r300 input.pdf

zdenop · 2018-11-24T09:29:27Z

Interesting: running tesseract i2059.png i2059 pdf get.images dumps the internal pix as tessinput.tif, which is correct...

jbreiden · 2018-12-14T04:41:40Z

Next Leptonica will have a fix. We can also add a workaround in Tesseract which will force transcoding of 1 bit per pixel PNG in pdfrenderer.cpp.

nguyenq · 2019-06-22T12:13:33Z

I observe the same color inverted problem in PDF output using the latest code.

Version:
tesseract 5.0.0
leptonica-1.78.0 (Mar 23 2019, 10:29:44) [MSC v.1916 DLL Release x86]
libgif 5.1.4 : libjpeg 9c : libpng 1.6.36 : libtiff 4.0.10 : zlib 1.2.11 : libwebp 1.0.1 : libopenjp2 2.3.0

out.pdf

jbreiden · 2019-07-09T05:17:13Z

Thought we fixed this already in Leptonica. Dan will take a look.

DanBloomberg · 2019-07-09T06:00:20Z

I don't remember what we did. I can't work on this for 2 weeks. Jeff, can you figure out what needs to be done to fix the problem?

zdenop · 2019-07-27T14:17:01Z

@jbreiden @DanBloomberg : issue is still valid for the latest code of tesseract and leptonica.

jbreiden · 2019-08-07T00:55:11Z

https://github.com/DanBloomberg/leptonica/blob/master/src/pdfio2.c#L656

Reproduces in pure Leptonica, using converttopdf from leptonica-progs. Leptonica sees this image is binary (1 bit per sample) and decides to transcode. It calls pixRead() then pixGenerateFlateData(). The input image does have a colormap. Need to do a bit more tracing to see where it goes wrong.

Colormap entries: 2
Colormap:
  0: (  0,  0,  0) #000000 gray(0)
  1: (255,255,255) #FFFFFF gray(255)

DanBloomberg · 2019-08-08T03:52:27Z

the colormap is opposite to the standard photometry for 1 bpp.

So it's likely that we're just ignoring the fact that there is a colormap, which would give us a video-inverted image. If so, the fix is to check for the colormap and remove it first.

jbreiden · 2019-08-08T05:44:53Z

Or the colormap doesn't make it into the PDF.

DanBloomberg · 2019-09-11T00:13:20Z

Looked at the code, both in pngio.c and pdfio2.c.

Leptonica appears to be doing the right thing. When identify -verbose indicates a colormap in a b/w png, there is an ambiguity. It may in fact have a colormap, or it may not, in which case 'identify' pretends that it does. For a b/w pix without a colormap, png stores the values inverted, so the png reader must invert the pixel value. However, if the 1 bpp png actually has a colormap, the reader removes it, and in doing so, it compares the color components of index 0 and 1. For a b/w colormap, it inverts if 1 is 255 and 0 is 0. (Removal of the colormap must be lossless, i.e., not change the values of the pixels.)
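
The colormap-removal rule described above can be sketched as follows. This is an illustration only, not Leptonica's code: Rgb and MustInvertOnColormapRemoval are hypothetical names, and the sketch assumes the usual 1 bpp pix convention in which a pixel value of 1 is black and 0 is white.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for one colormap entry; not a Leptonica type.
struct Rgb {
  std::uint8_t r, g, b;
};

// Assumed convention: in a 1 bpp pix without a colormap, a pixel
// value of 1 is black and 0 is white.
// Returns true if the pixel bits must be flipped when a two-entry
// black/white colormap is dropped, so the image keeps its appearance.
bool MustInvertOnColormapRemoval(const std::array<Rgb, 2>& cmap) {
  auto is_black = [](const Rgb& c) {
    return c.r == 0 && c.g == 0 && c.b == 0;
  };
  auto is_white = [](const Rgb& c) {
    return c.r == 255 && c.g == 255 && c.b == 255;
  };
  // This sketch only covers pure black/white colormaps.
  assert((is_black(cmap[0]) && is_white(cmap[1])) ||
         (is_white(cmap[0]) && is_black(cmap[1])));
  // Index 0 = black, index 1 = white is the opposite of the pix
  // convention, so the bits have to be inverted.
  return is_black(cmap[0]) && is_white(cmap[1]);
}
```

For the colormap reported earlier in this thread (0 = black, 1 = white) the function returns true; for the reverse ordering it returns false and the bits can be kept as they are.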

So much for png. For pdf generation, as Jeff mentioned, a 1 bpp image will be transcoded. The first step is reading the image. As described above, a b/w pix (with or without a colormap), after being written to png and read back to a pix, is unchanged. The pdf generator doesn't change the polarity either.

If someone can find a 1 bpp (b/w) image, as a png file, with or without a colormap, that gets inverted as a pdf (e.g., using convertToPdf), please attach it to this issue.

DanBloomberg · 2019-09-11T00:22:14Z

Perhaps the problem is in tesseract. Are the 1 bpp images being transcoded?

zdenop · 2019-09-11T07:42:32Z

@jbreiden: convertToPdf with the above-mentioned images creates a "not inverted" pdf.
The tessinput.tif created during

  tesseract 48639675-31a07680-e9cc-11e8-9e47-00130ea102c1.png a get.images pdf

is correct...

zdenop · 2019-10-18T18:04:05Z

@jbreiden: did you have a chance to have a look at this?

jbreiden · 2019-10-25T00:34:54Z

Taking time now to look again. Let's use this image and trace.

https://user-images.githubusercontent.com/1501035/59963730-d03e0a00-94bc-11e9-804a-f7f20e17bb4e.png

pixGenerateCIData runs, which dispatches to pixGenerateFlateData. It gets back this:

  cid->type = L_FLATE_ENCODE
  cid->ncolors = 0

That causes us to write a PDF object containing this metadata, which I think is the only possible choice.

  /ColorSpace /DeviceGray
  /BitsPerComponent 1
  /Filter /FlateDecode
  /DecodeParms
  <<
    /Colors 1
    /BitsPerComponent 1
  >>

I think this isolates where we are going wrong. Either the pixels need to be inverted, or we need to keep the colormap. Either way, I think the answer is going to be in pixGenerateFlateData.

jbreiden · 2019-10-25T18:08:28Z

Looking even more carefully at code in pdfrenderer.cpp, I see that all PNG files are routed through pixGenerateCIData().

https://github.com/tesseract-ocr/tesseract/blob/master/src/api/pdfrenderer.cpp#L679

  if (pixGetInputFormat(pix) == IFF_PNG)
    sad = pixGenerateCIData(pix, L_FLATE_ENCODE, 0, 0, &cid);

That is the proximate cause of this bug. When I originally wrote this code, my intention was to route as much PNG as possible through l_generateCIDataForPdf, which is the most "hands off" code path and handles this image just fine. This is what the code looked like last time I touched it, on 2016-07-21.

  if (pixGetSpp(pix) == 4 && format == IFF_PNG) {
    Pix *p1 = pixAlphaBlendUniform(pix, 0xffffff00);
    sad = pixGenerateCIData(p1, L_FLATE_ENCODE, 0, 0, &cid);
    pixDestroy(&p1);
  } else {
    sad = l_generateCIDataForPdf(filename, pix, kJpegQuality, &cid);
  }

My suggestion for solving this bug is:

(a) Review change history and find out why Tesseract started routing all PNG to pixGenerateCIData
(b) Set everything (or as much as possible) back to l_generateCIDataForPdf
(c) Test with the various tricky flavors of PNG from bug reports over the years
(d) Once all is well, lock it down with a unit test

DanBloomberg · 2019-10-25T21:44:33Z

Thank you for looking at this, Jeff.

I haven't been able to prove that the change you found is responsible for the inversion problem.
Someone needs to substitute l_generateCIDataForPdf() for pixGenerateCIData() to see if the problem is fixed.

I did a little experimentation with that image. The test program is attached (I had to call it ".txt" to upload it).

Calling pixGenerateCIData() on a 1 bpp pix with a colormap generates a proper pdf with the colormap embedded.

Calling l_generateCIDataForPdf() on a png file of a 1 bpp image with a colormap causes the colormap to be removed, and then generates a proper pdf.

pdftest_invert.txt

jbreiden · 2019-10-26T00:06:20Z

> Someone needs to substitute l_generateCIDataForPdf() for pixGenerateCIData() to see if the problem is fixed.

I did. It works for the image in this bug. However, I did not test against the images from other bugs, e.g. #1914 #1361 [ I'm sure there were some more, not sure how to find them ]

> Calling pixGenerateCIData() on a 1 bpp pix with a colormap generates a proper pdf with the colormap embedded.

This contradicts my tests from yesterday. Send me the output PDF, please.

DanBloomberg · 2019-10-26T01:20:23Z

The attached pdf with a colormap is generated in lines 35-38 of pdftest_invert (attached above).

bitinvert1.pdf

zdenop · 2019-10-26T14:25:10Z

@jbreiden : I am afraid the problem is somewhere else: regardless of whether I used l_generateCIDataForPdf or pixGenerateCIData, I got an inverted image in the pdf.

So I made a simple test: create a pdf from the cid with:

  l_uint8 *test_data;
  size_t test_nbytes;
  cidConvertToPdfData(cid, "testing issue 2059", &test_data, &test_nbytes);
  l_binaryWrite("test_i2059.pdf", "w", test_data, test_nbytes);
  lept_free(test_data);  // buffer was allocated by Leptonica, not with new
  printf("test finished.\n");

after

tesseract/src/api/pdfrenderer.cpp, lines 678 to 683 in 048f729:

  int sad = 0;
  if (pixGetInputFormat(pix) == IFF_PNG)
    sad = pixGenerateCIData(pix, L_FLATE_ENCODE, 0, 0, &cid);
  if (!cid) {
    sad = l_generateCIDataForPdf(filename, pix, jpg_quality, &cid);
  }

and the image in test_i2059.pdf is not inverted...

zdenop · 2019-10-26T15:11:59Z

@jbreiden: when I add /Decode [1 0] to the PDF object metadata in the resulting pdf, right after /ColorSpace /DeviceGray, the image is shown correctly. See i2059_org.pdf and i2059_fixed.pdf.
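
For reference, a minimal sketch of how a PDF viewer maps a DeviceGray image sample to a gray level with and without the /Decode array. It uses the linear sample mapping defined in the PDF specification, where 0.0 is black and 1.0 is white; DecodeGraySample is a hypothetical helper name, not part of Tesseract or Leptonica.

```cpp
#include <cassert>

// Maps an n-bit image sample to a DeviceGray level (0.0 = black,
// 1.0 = white) using a PDF /Decode pair [d_min d_max].
// The default pair for DeviceGray is [0 1].
double DecodeGraySample(unsigned sample, unsigned bits_per_component,
                        double d_min, double d_max) {
  assert(bits_per_component >= 1 && bits_per_component <= 16);
  const unsigned max_sample = (1u << bits_per_component) - 1;
  assert(sample <= max_sample);
  // Linear interpolation between d_min and d_max.
  return d_min + sample * (d_max - d_min) / max_sample;
}
```

With the default [0 1], a 1-bit sample of 1 is rendered white. With /Decode [1 0] the same sample is rendered black, which matches a 1 bpp buffer in which 1 means black.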

jbreiden · 2019-10-26T20:48:47Z

@zdenop is right. This "/Decode [1 0]" was the missing piece of the puzzle. An immediate fix is to add that to pdfrenderer.cpp in Tesseract, similar to how Leptonica does it.

https://github.com/tesseract-ocr/tesseract/blob/master/src/api/pdfrenderer.cpp#L724
https://github.com/DanBloomberg/leptonica/blob/master/src/pdfio2.c#L1953

However, I would still like to send most or all PNG files to l_generateCIDataForPdf, to avoid transcoding. It would be super helpful to have various flavors of PNG on hand to test, such as:

(1) 1 bit per pixel, not indexed
(2) 1 bit per pixel, indexed with 0 = black, 1 = white (from this bug)
(3) 1 bit per pixel, indexed with 1 = black, 0 = white
(4) 1 bit per pixel, indexed with 2 colors that are not black or white
(5) That fun image from #369

zdenop · 2019-10-27T13:28:10Z

fixed with 4a37cde.

zdenop · 2019-10-27T13:34:57Z

(4) - is it possible to create such a png? I tried

  convert 3_colors_no_bw.png -colors 2 +dither image.bmp

and irfanview reports it has 2 colors (1 BitsPerPixel). When I tried

  convert 3_colors_no_bw.png -colors 2 +dither image.png

irfanview reports it has 16,7 Million (24 BitsPerPixel)...

[image: 3_colors_no_bw] <https://user-images.githubusercontent.com/574156/67635398-eb83b080-f8c6-11e9-8457-1f73c66f9503.png>

DanBloomberg · 2019-10-27T15:50:34Z

I'm not sure what you're trying to do, but if you add dithering, the value of every pixel depends on its neighbors, and if there is color, there are in general no restrictions on the rgb values that the pixels can achieve. The only way to represent such an image is in a full color space, because colormaps with 8 bit pixels can only have 256 different values.

zdenop · 2019-10-27T15:57:32Z

In (4), Jeff wants a png image that is "1 bit per pixel, indexed with 2 colors that are not black or white".

DanBloomberg · 2019-10-27T16:08:43Z

you can make it with leptonica, as I did in the test program I attached to this bug. Read the 1 bpp image. Make a colormap consisting of two colors with any rgb values. The first color is applied to the 0 pixels, the second to the 1 pixels. Then add the colormap to the pix and save as png. If you want, I'll do this tomorrow with your cat and mouse story image (btw, I don't entirely trust that cat ...)

zdenop · 2019-10-27T16:14:12Z

That would be great, Dan. BTW: the text is the beginning of the fairy tale CAT AND MOUSE IN PARTNERSHIP by the Brothers Grimm.

jbreiden · 2019-10-27T17:21:31Z

Goal is to make a set of test images, and find out what types of PNG can safely move from pixGenerateCIData to l_generateCIDataForPdf. If we are super lucky, it will be everything. PS. I feel like I am living in a fairy tale with today's power outage. <https://www.sfchronicle.com/california-wildfires/article/PG-E-outages-loom-for-up-to-2-million-in-14564579.php>

DanBloomberg · 2019-10-28T18:49:44Z

@zdenop

Took your cat-and-mouse example image (antialiased, with 16 colors) and converted it to 1 bpp with colors similar to those in the original, using:

pix1 = pixRead("cat-and-mouse-orig.png");
pix2 = pixConvertTo32(pix1);
pix3 = pixConvertTo1(pix2, 225);
cmap = pixcmapCreate(1);
pixcmapAddColor(cmap, 254, 240, 185);
pixcmapAddColor(cmap, 50, 50, 130);
pixSetColormap(pix3, cmap);
pixWrite("cat-and-mouse.png", pix3, IFF_PNG);

The output image is attached.

DanBloomberg · 2019-10-30T00:52:40Z

I've incorporated some tests into the leptonica test prog/pdfio1_reg.c

Namely, lines 221 - 272. It has a new function I wrote that converts an RGB image to a colormapped one, if the number of colors <= 256.
The reason for this function is that the image cat-and-mouse.png has a colormap, but when it is read in to make a pix, it is converted into a 32 bpp RGB. This function converts it back to a 1 bpp pix.
pdftest.txt

The input image is

zdenop · 2019-11-04T08:47:09Z

@jbreiden : will you be able to create unittests for this?

ShinjiLE · 2019-11-04T13:45:12Z

Ahoi !
Since commit 4a37cde the output is inverted for my images. The images are created by scantailor.

inverted result.pdf

MSR - messen steuern regeln 5 1984 0018_1L.tar.gz

zdenop · 2019-11-04T14:06:13Z

@ShinjiLE : thanks for the example. Your files are a missing piece of this puzzle. In this case there should be /Decode [0 1] instead of /Decode [1 0].
I wonder if this could be handled by leptonica instead of coding it in tesseract...

jbreiden · 2019-11-06T16:41:02Z

Thanks for the problem report, this is why a test suite is so important. @zdenop can you please narrow 4a37cde with the following:

  if (cid->bps == 1 && pixGetInputFormat(pix) == IFF_PNG)
    colorspace.str(" /ColorSpace /DeviceGray\n"
                   " /Decode [1 0]\n");
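
The narrowed condition proposed above can be sketched in isolation as follows. This is a hedged illustration, not the actual pdfrenderer.cpp code: InputFormat and GrayColorSpace are hypothetical names standing in for Leptonica's IFF_* constants and Tesseract's internal string building.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for Leptonica's input-format constants.
enum class InputFormat { kPng, kTiffG4, kOther };

// Returns the colorspace lines for a grayscale image object.
// Only 1-bit PNG input gets the inverted /Decode array; other
// 1-bit inputs (e.g. G4 TIFF) keep the PDF default of [0 1].
std::string GrayColorSpace(int bits_per_sample, InputFormat format) {
  assert(bits_per_sample >= 1);
  std::string out = "  /ColorSpace /DeviceGray\n";
  if (bits_per_sample == 1 && format == InputFormat::kPng) {
    out += "  /Decode [1 0]\n";
  }
  return out;
}
```

With this rule, GrayColorSpace(1, InputFormat::kPng) contains /Decode [1 0], while 1-bit G4 TIFF input and 8-bit PNG input keep the default polarity.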

zdenop · 2019-11-07T12:44:05Z

@DanBloomberg @jbreiden: what about creating a function in leptonica for this? It would take the cid as an argument and return a string with all the needed information. I feel like we will copy this part of leptonica <https://github.com/DanBloomberg/leptonica/blob/8d696ce2997e68e0cf8a7fe1d175fe2584f549c0/src/pdfio2.c#L1914-L2007> to tesseract step by step, and I think it would be better to manage (possible) problems in place (in leptonica).

DanBloomberg · 2019-11-07T18:33:13Z

I haven't looked carefully at the tesseract code, and do not know why there is this apparent duplication.

Once the cid (L_COMP_DATA struct) is built, there is just one call to make the pdf data:

  cidConvertToPdfData()

This makes an lpd (L_PDF_DATA struct), and launches

  l_generatePdf(&data, &size, lpd)

to make the pdf string. l_generatePdf() uses this lpd to sequentially call:

  generateFixedStringsPdf(lpd);
  generateMediaboxPdf(lpd);
  generatePageStringPdf(lpd);
  generateContentStringPdf(lpd);
  generatePreXStringsPdf(lpd);
  generateColormapStringsPdf(lpd);
  generateTrailerPdf(lpd);
  generateOutputDataPdf(pdata, pnbytes, lpd);

I agree it would be nice if tesseract can use as many of these generators as possible.

jbreiden · 2019-11-08T08:08:39Z

I am sick, but willing and (hopefully) able to spend time figuring this out gracefully when I feel better. In the meantime G4 TIFF is broken, and that is a very important format. Please narrow as per my previous comment. I can't do it myself because I lost my GitHub 2-factor authentication.

zdenop · 2019-11-10T15:12:37Z

Understood. Done: 2d6f38e

stweil · 2019-11-20T06:49:42Z

There was a new issue #2771 reported which has a similar problem, but with TIFF instead of PNG. It claims that 4.1 worked correctly, so that might be different.

zdenop added PDF output issues related output formats labels Nov 24, 2018

zdenop mentioned this issue Nov 18, 2019

Regression: inverted image with pdf output mode and monochrome input TIFF #2771

Closed

stweil changed the title ~~Bilevel BW png~~ Bilevel BW png inverted in PDF output Nov 20, 2019

amitdo added the leptonica label Mar 21, 2021
