best compression settings for a wsi viewer #490

Open
sinamcr7 opened this issue Aug 3, 2024 · 38 comments

Comments

@sinamcr7

sinamcr7 commented Aug 3, 2024

Hi, I have a project that builds WSI images from frames captured by a camera on a microscope. My code uses OpenCV and NumPy to hold the final array and update it with every frame.
I have some questions:
1. When I save the final file, what are the best settings to save it as a pyramid that is suitable for viewing and analysis without quality loss? (I was saving with OpenCV's imwrite at first, but rotating the final image needed double the memory to build the rotated copy before saving, so I decided to do this with pyvips.)
2. Is there any other method that can speed up this process or reduce memory usage? Right now I allocate a 50k x 50k x 3 array at program start and just update it, or pad it when I get to the edges. It takes 7.5 GB of memory, and while a pad is running usage roughly doubles, then drops back to the real size once the pad has finished.
3. Is it normal that opening these images with Windows Photo Viewer takes several seconds to minutes?
This is the code I'm using to rotate and save the TIFF image written by OpenCV's imwrite:

import pyvips

image = pyvips.Image.new_from_file('../wsi.tiff')
image = image.rotate(90)
image.tiffsave('../wsi2.tiff', tile=True, compression='jpeg', bigtiff=False, pyramid=True, Q=100, tile_width=256, tile_height=256, properties=True)
@jcupitt
Member

jcupitt commented Aug 3, 2024

Hello @sinamcr7,

  1. I think I'd use Q=85, that's roughly what slide scanner companies use. You could possibly use 512x512 tiles, though 256x256 is a good choice. 1024x1024 is probably too large.

  2. You could move more of your array assembly processing to pyvips, it should speed it up and drop memory use. I would look at merge and friends:

    https://www.libvips.org/API/current/libvips-mosaicing.html#vips-merge

    It does a pair-wise join of two images with a feathered edge. You can merge your $n camera images and save as pyramidal tiff in one step with no intermediates.

  3. Windows photo viewer is not designed for large images. It will just decompress the entire thing to RAM and then paint the screen from that, so it'll be very slow and need colossal amounts of memory. I made a simple image viewer which should be quick:

    https://github.com/jcupitt/vipsdisp

    There's a windows build here:

    https://github.com/jcupitt/vipsdisp/releases/tag/v3.0.4

    Just unzip somewhere and doubleclick the exe. The keyboard shortcuts are useful:

    https://github.com/jcupitt/vipsdisp?tab=readme-ov-file#shortcuts

    It supports things like colour management for slide images, which can be handy if you have a profile for your camera and microscope.
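To get a feel for the tile sizes in point 1, here is a rough sketch of how many pyramid levels and tiles a 50k x 50k image produces. It assumes each level is half the previous one (floor division) and that the pyramid stops once a level fits in a single tile; the exact rounding libvips uses may differ, and `pyramid_levels` is a hypothetical helper, not a pyvips function.

```python
import math


def pyramid_levels(width, height, tile=256):
    """Return (width, height, tile_count) for each level, largest first.

    Assumption: every level halves the previous one and the pyramid
    ends with the first level that fits inside one tile.
    """
    levels = []
    w, h = width, height
    while True:
        tiles = math.ceil(w / tile) * math.ceil(h / tile)
        levels.append((w, h, tiles))
        if w <= tile and h <= tile:
            break
        w, h = max(1, w // 2), max(1, h // 2)
    return levels


levels = pyramid_levels(50_000, 50_000, tile=256)
for w, h, tiles in levels:
    print(f"{w:>6} x {h:<6} {tiles:>6} tiles")
```

Under these assumptions the full-resolution level has 38416 tiles at 256x256 but only 2401 at 1024x1024, so larger tiles mean fewer, bigger reads per viewport, which is the trade-off behind the tile-size advice.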

@jcupitt
Member

jcupitt commented Aug 3, 2024

I've just realised you already have the entire image in memory as a numpy array, is that right?

In which case you can simply do:

image = pyvips.Image.new_from_array(big_numpy_array)
image = image.rot90()
image.tiffsave("some-filename.tif",
               compression="jpeg",
               Q=85,
               tile=True,
               tile_width=256,
               tile_height=256,
               pyramid=True)

libvips and numpy will share the memory, so there will (I think!) be no copy and no extra memuse.

.rot90() does a fixed 90 degree rotate, so it's faster and more accurate than .rotate(90).
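A minimal pure-Python sketch of why a fixed 90 degree rotate can be exact: it only moves pixels to new positions, so no interpolation is involved. This illustrates the idea on a nested list and is not how libvips implements it; `rot90_cw` is a hypothetical helper.

```python
def rot90_cw(pixels):
    """Rotate a row-major grid 90 degrees clockwise by re-indexing only."""
    # Reverse the row order, then read columns as the new rows.
    return [list(row) for row in zip(*pixels[::-1])]


grid = [[1, 2],
        [3, 4]]
rotated = rot90_cw(grid)
print(rotated)

# Four quarter turns give back the original values exactly.
restored = grid
for _ in range(4):
    restored = rot90_cw(restored)
```

An arbitrary-angle rotate has to resample pixel values, which is where the extra cost and small errors come from.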

You could also use pyvips to save as OME-TIFF, though it'll be slower. Have you looked at QuPath?

@sinamcr7
Author

sinamcr7 commented Aug 4, 2024

Thanks for the suggestions.
Yes, I allocate the image array at program launch, so it takes 7.5 GB of memory from the start. That is to reduce the need for padding later. Is it better to convert all operations to vips, or is OpenCV enough?
No, I haven't tried QuPath yet.
I'm interested in using vips to build the WSI if it reduces memory usage. I have a PyQt GUI that shows windows while the WSI is being built. Currently, when I reach the edges I have to pad with np.pad, which takes some time and adds delays to the GUI, so I chose a big array with a good starting point to avoid pads as much as I can. Is there any workaround for this?

@jcupitt
Member

jcupitt commented Aug 4, 2024

I'd stick to opencv if your code is working. pyvips ought to be able to make the pyramidal tiff directly from the numpy array with only a relatively small amount of extra memory.

It depends how you are making the slide image. What corrections do you apply to frames from the microscope? How accurate is your stage? We'd need to get into a lot of detail before I could answer.

@sinamcr7
Author

sinamcr7 commented Aug 4, 2024

OK thanks, then I'll keep the OpenCV part. I also have an issue with np.pad: when my app reaches it, memory use doubles, and after several seconds it drops back to what the array should actually take. Is there a faster, lower-memory way to pad an array?

@jcupitt
Member

jcupitt commented Aug 4, 2024

Sorry, no, array resize needs to reallocate memory, and that means it must double.
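The doubling can be estimated with plain arithmetic. The sketch below assumes 8-bit RGB samples and a pad applied to all four sides; the pad width of 1000 pixels is a made-up figure for illustration. Padding builds a new, larger array while the old one is still alive, so the peak is the sum of both.

```python
def array_bytes(width, height, channels=3, bytes_per_sample=1):
    """Size in bytes of a dense width x height x channels array."""
    return width * height * channels * bytes_per_sample


def pad_peak_bytes(width, height, pad, channels=3):
    """Peak memory during a pad: old array plus the new padded copy."""
    old = array_bytes(width, height, channels)
    new = array_bytes(width + 2 * pad, height + 2 * pad, channels)
    return old + new


GB = 10**9
base = array_bytes(50_000, 50_000)
peak = pad_peak_bytes(50_000, 50_000, pad=1_000)  # hypothetical pad width
print(f"steady state: {base / GB:.1f} GB, peak during pad: {peak / GB:.1f} GB")
```

The steady-state figure matches the 7.5 GB reported earlier in the thread.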

You could change to a tiled array. Cut your image into eg. 1024x1024 tiles and keep a meta-array of references to them. Now you can pad by just making a new column of tiles on the right and nulling the references to the old column of tiles on the left.
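A minimal sketch of the tiled idea, assuming 8-bit RGB pixels. It uses a dict keyed by tile coordinates rather than a literal meta-array, so the canvas can grow in any direction (including negative coordinates) without copying existing tiles. `TiledCanvas` and its methods are hypothetical names, and a real implementation would write whole frames into tiles rather than single pixels.

```python
class TiledCanvas:
    """Sparse canvas: tiles are allocated lazily and never moved."""

    def __init__(self, tile=1024, channels=3):
        self.tile = tile
        self.channels = channels
        self.tiles = {}  # (tile_x, tile_y) -> bytearray

    def _tile_for(self, x, y, create):
        # Floor division also works for negative coordinates.
        key = (x // self.tile, y // self.tile)
        buf = self.tiles.get(key)
        if buf is None and create:
            buf = bytearray(self.tile * self.tile * self.channels)
            self.tiles[key] = buf
        return buf

    def _offset(self, x, y):
        return ((y % self.tile) * self.tile + (x % self.tile)) * self.channels

    def set_pixel(self, x, y, rgb):
        buf = self._tile_for(x, y, create=True)
        o = self._offset(x, y)
        buf[o:o + self.channels] = bytes(rgb)

    def get_pixel(self, x, y):
        buf = self._tile_for(x, y, create=False)
        if buf is None:
            return (0,) * self.channels  # unallocated area reads as black
        o = self._offset(x, y)
        return tuple(buf[o:o + self.channels])

    def drop_column(self, tile_x):
        """Free every tile in one tile column."""
        for key in [k for k in self.tiles if k[0] == tile_x]:
            del self.tiles[key]


canvas = TiledCanvas(tile=4)          # tiny tiles keep the demo cheap
canvas.set_pixel(-1, 5, (10, 20, 30))  # "pads" to the left with no copy
canvas.set_pixel(100, 5, (1, 2, 3))    # and far to the right
```

Growing the canvas costs one tile allocation at a time instead of a full reallocation.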

@jcupitt
Member

jcupitt commented Aug 4, 2024

... though a tiled array will make save difficult, of course.

I've built several imaging systems like this. I've always done it in two stages: first, scan the camera over the slide and capture a set of frames to your storage. You can keep a low-res version in memory to show the user what the slide looks like during the scan. You can do geometric and radiometric correction at this stage. You examine the overlap areas and estimate the frame positions from the content.

Second, have an "assembly" phase which reads the corrected tiles from your storage, merges them together using offsets computed in stage 1, possibly does some extra corrections for lighting consistency or colour, and saves as a pyramidal tiff. pyvips would be a reasonable choice for the assembly stage.

Commercial slide scanners usually work in the same way, though some very high throughput systems will have a huge area of RAM for assembly rather than temporary files on your drive.

@sinamcr7
Author

sinamcr7 commented Aug 5, 2024

Well, our algorithm needs that big array for some calculations. If we remove that part I don't know how I would compute exact locations and overlaps. Also, without it, how can I show the thumbnail and current frame to the user?

I also tried vips tiffsave with JPEG at Q=85 and Q=100, and OpenCV imwrite. There wasn't much difference in quality, though the OpenCV imwrite output seems to have higher quality. Is that right, or did I do something wrong? Here's the code:

cv2.imwrite('wsi.tiff', self.image)
image = cv2.cvtColor(self.image, cv2.COLOR_BGR2RGB)
image = numpy_to_pyvips(image)
image = image.rot90()
image.tiffsave('wsi2.tiff', tile=True, compression='jpeg', pyramid=True, Q=85, tile_width=256, tile_height=256, properties=True)
image.tiffsave('wsi3.tiff', tile=True, compression='jpeg', pyramid=True, Q=100, tile_width=256, tile_height=256, properties=True)

@jcupitt
Member

jcupitt commented Aug 5, 2024

The scanners I've made have driven the stage in an approximate grid, then examined the overlaps to find a set of exact offsets. The set of offsets is an overdetermined linear system, so you can use eg. least squares to find a set of frame positions which minimise overall positioning error.
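A small sketch of the least-squares step along one axis, in plain Python so it has no dependencies (in practice something like `numpy.linalg.lstsq` would be the usual tool). Each measurement `(i, j, d)` says frame `j` sits about `d` pixels after frame `i`; frame 0 is pinned at 0. X and Y are independent, so the same solve can be run once per axis. The function name and the sample numbers are made up for illustration.

```python
def solve_positions(n_frames, measurements):
    """Least-squares 1D positions from pairwise offsets (i, j, d)."""
    n = n_frames - 1  # unknowns are frames 1..n_frames-1
    ata = [[0.0] * n for _ in range(n)]  # normal matrix A^T A
    atd = [0.0] * n                      # right-hand side A^T d
    for i, j, d in measurements:
        row = {}
        if j > 0:
            row[j - 1] = 1.0
        if i > 0:
            row[i - 1] = -1.0
        for a, va in row.items():
            atd[a] += va * d
            for b, vb in row.items():
                ata[a][b] += va * vb
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(ata[r][col]))
        ata[col], ata[pivot] = ata[pivot], ata[col]
        atd[col], atd[pivot] = atd[pivot], atd[col]
        for r in range(col + 1, n):
            f = ata[r][col] / ata[col][col]
            for c in range(col, n):
                ata[r][c] -= f * ata[col][c]
            atd[r] -= f * atd[col]
    # Back substitution.
    pos = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = atd[r] - sum(ata[r][c] * pos[c] for c in range(r + 1, n))
        pos[r] = s / ata[r][r]
    return [0.0] + pos


# Three frames in a row; the three overlap estimates disagree by 3 pixels.
positions = solve_positions(3, [(0, 1, 100.0), (1, 2, 100.0), (0, 2, 203.0)])
print(positions)
```

The 3 pixel disagreement is spread evenly, one pixel per measurement, instead of accumulating at the end of the chain.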

Friends have made systems which calibrate the stepper motors instead, so they use a known target, then drive the stage over the field grabbing frames and refining a pair of XY positioning tables. This can work very well, but calibration takes a long time and you have to redo it fairly often as the threads in the stages wear.

You can keep eg. a 10k x 10k image in memory plus the current frame to show the user, then generate the full 100k x 100k image during save.
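A back-of-envelope sketch of that preview approach, assuming 8-bit RGB and an integer shrink factor. The helper names are hypothetical; the point is only the memory difference and the coordinate mapping between the full-resolution image and the preview.

```python
def rgb_bytes(size):
    """Bytes for a square 8-bit RGB image with the given edge length."""
    return size * size * 3


def full_to_preview(x, y, shrink):
    """Map a full-resolution pixel position to preview coordinates."""
    return x // shrink, y // shrink


FULL = 100_000
PREVIEW = 10_000
shrink = FULL // PREVIEW

preview_mb = rgb_bytes(PREVIEW) / 10**6
full_gb = rgb_bytes(FULL) / 10**9
print(f"preview: {preview_mb:.0f} MB, full image: {full_gb:.0f} GB")
print(full_to_preview(73_456, 12_345, shrink))
```

Each incoming frame would be shrunk by the same factor before being pasted into the preview at the mapped position.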

The settings mostly affect file size. I see:

$ vips copy CMU-1.svs x-85.tif[compression=jpeg,tile,pyramid,Q=85]
$ vips copy CMU-1.svs x-100.tif[compression=jpeg,tile,pyramid,Q=100]
$ vips copy CMU-1.svs x.tif
$ ls -l x* CMU-1.svs
-rw-r--r-- 1 john john  794542522 Aug  5 13:37 x-100.tif
-rw-r--r-- 1 john john  174346312 Aug  5 13:33 x-85.tif
-rw-r--r-- 1 john john 6056321460 Aug  5 13:38 x.tif
-rw-r--r-- 1 john john  177552579 Feb 13  2021 CMU-1.svs

The uncompressed TIFF will obviously be the best quality, but 6GB is far too large to be practical. Almost no users will want to use it. Q100 is roughly 8x smaller and Q85 roughly 35x smaller. Aperio (the slide scanner that makes SVS format slides) is using Q85.
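The ratios can be checked directly from the `ls -l` sizes listed above:

```python
# File sizes in bytes, copied from the directory listing in this comment.
sizes = {
    "x.tif": 6056321460,      # uncompressed
    "x-100.tif": 794542522,   # JPEG Q=100
    "x-85.tif": 174346312,    # JPEG Q=85
    "CMU-1.svs": 177552579,   # original Aperio file
}

uncompressed = sizes["x.tif"]
ratios = {name: uncompressed / size for name, size in sizes.items()}

for name, ratio in ratios.items():
    print(f"{name:<10} {ratio:5.1f}x smaller than uncompressed")
```

The Q=85 output lands within about 2% of the original SVS file size, which is consistent with the scanner having used a similar quality setting.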

@sinamcr7
Author

sinamcr7 commented Aug 5, 2024

Oh, you're talking about motorized scanners. I'm working on manual WSI scanning. The user should be able to see the area around the FOV, the FOV itself for a focus check, and a thumbnail of the big image, so they can see when it is reaching the edges and needs to wait for a pad. If there are any errors, they use tools to clear the wrong frames back to the last correct frame, then start scanning again.

@jcupitt
Member

jcupitt commented Aug 5, 2024

Ah OK, I've never tried with a manual stage, I agree that would need a different approach.

Other friends have made interactive scanners using a manual stage, but I don't know what technical solution they used. Probably allocating the complete image in memory before starting.

@nahilsobh

nahilsobh commented Sep 20, 2024

Hi @jcupitt --

Looking back at the libvips 8.15.1 release, I noticed that Q >= 90 is not supported.

But in the statement above by @sinamcr7 (copied here):

image.tiffsave('wsi3.tiff', tile=True, compression='jpeg', pyramid=True, Q=100, tile_width=256, tile_height=256, properties=True)

I'm curious whether Q=100 will produce a falsely coloured image?

Much appreciated.

@jcupitt
Member

jcupitt commented Sep 20, 2024

8.15.1 fixed that bug, so Q100 should be fine.

@nahilsobh

nahilsobh commented Sep 20, 2024

I used 8.15.3 with this image:
http://merovingio.c2rmf.cnrs.fr/iipimage/PalaisDuLouvre.tif
vips --version
vips-8.15.3
vips im_vips2tiff PalaisDuLouvre.tif output_image.tif:jpeg:90,tile:256x256,pyramid
and got a falsely colored image.
Thx.

@nahilsobh

Screen Shot 2024-09-20 at 8 46 04 AM

@jcupitt
Member

jcupitt commented Sep 20, 2024

I see:

$ vips --version
vips-8.15.1
$ vips im_vips2tiff PalaisDuLouvre.tif output_image.tif:jpeg:90,tile:256x256,pyramid

I can view it in vd and eog:

image

image

Here's the file it generated:

www.rollthepotato.net/~john/output_image.tif

It could be a bug in your image viewer -- try downloading and viewing the version I made.

@jcupitt
Member

jcupitt commented Sep 20, 2024

You are using the old vips7 syntax, which is deprecated.

The new CLI syntax is:

$ vips tiffsave PalaisDuLouvre.tif output_image.tif --compression jpeg --Q 90 --tile-width 256 --tile-height 256 --pyramid

You can also write:

$ vips copy PalaisDuLouvre.tif output_image.tif[compression="jpeg",Q=90,tile-width=256,tile-height=256,pyramid]

The new interface is often faster, so it's worth changing over. It's easier to read as well, of course.

@nahilsobh

I'm getting this output in qupath
image

@nahilsobh

nahilsobh commented Sep 20, 2024

import matplotlib.pyplot as plt
import matplotlib.image as mpimg
img = mpimg.imread("output_image.tif")
plt.imshow(img)
plt.show()

and got this
image

@nahilsobh

I tried with Q=89
and got this
image

@nahilsobh

I think v8.15.3 cannot take Q >= 90 and generate a correctly compressed image.

@jcupitt
Member

jcupitt commented Sep 20, 2024

Did you download the image I made and test that?

I tried that image in QuPath 0.5 and I see:

image

What version of QuPath are you using?

@nahilsobh

image

@jcupitt
Member

jcupitt commented Sep 20, 2024

Maybe QuPath on mac is using some system library that can't read these files? In any case, it's clearly a bug in your image viewer, the file is correct.

@nahilsobh

nahilsobh commented Sep 20, 2024

But how come the Python image viewers showed the same false colors as QuPath?

@jcupitt
Member

jcupitt commented Sep 20, 2024

I suppose it's also picking up a buggy library. Did you download the version of the image that I made?

@jcupitt
Member

jcupitt commented Sep 20, 2024

I'll try on my mac.

@nahilsobh

Thanks.

@jcupitt
Member

jcupitt commented Sep 20, 2024

I tried in macos Preview and the image looks fine. Did you download the version of the image that I made? What do you see with that exact file in macos Preview?

@nahilsobh

It displayed fine.
How would you recommend installing vips on my Mac so I can reproduce your results?
Thanks.

@jcupitt
Member

jcupitt commented Sep 20, 2024

Oh! I tried running the conversion on macos and it made a bad file. Perhaps this is a bug in homebrew libtiff? I'll investigate.

@nahilsobh

Thank you.

@jcupitt
Member

jcupitt commented Sep 20, 2024

Could you confirm that the file I made also works in your QuPath?

@nahilsobh

QuPath displayed the correct colors.

@nahilsobh

nahilsobh commented Sep 20, 2024

I made a conda env, installed pyvips, and ran the vips CLI, and I still see the same false colors with Q=90.
Here are the packages that were used:

packages in environment at /usr/local/Caskroom/miniforge/base/envs/test:

Name Version Build Channel

aom 3.9.1 hf036a51_0 conda-forge
atk-1.0 2.38.0 h4bec284_2 conda-forge
bzip2 1.0.8 hfdf4475_7 conda-forge
c-ares 1.33.1 h44e7173_0 conda-forge
ca-certificates 2024.8.30 h8857fd0_0 conda-forge
cairo 1.18.0 h37bd5c4_3 conda-forge
cffi 1.17.1 py312hf857d28_0 conda-forge
cfitsio 4.4.1 ha105788_0 conda-forge
dav1d 1.2.1 h0dc2134_0 conda-forge
expat 2.6.3 hac325c4_0 conda-forge
fftw 3.3.10 nompi_h292e606_110 conda-forge
font-ttf-dejavu-sans-mono 2.37 hab24e00_0 conda-forge
font-ttf-inconsolata 3.000 h77eed37_0 conda-forge
font-ttf-source-code-pro 2.038 h77eed37_0 conda-forge
font-ttf-ubuntu 0.83 h77eed37_2 conda-forge
fontconfig 2.14.2 h5bb23bf_0 conda-forge
fonts-conda-ecosystem 1 0 conda-forge
fonts-conda-forge 1 0 conda-forge
freetype 2.12.1 h60636b9_2 conda-forge
fribidi 1.0.10 hbcb3906_0 conda-forge
gdk-pixbuf 2.42.12 ha587570_0 conda-forge
ghostscript 10.04.0 hac325c4_0 conda-forge
giflib 5.2.2 h10d778d_0 conda-forge
graphite2 1.3.13 h73e2aa4_1003 conda-forge
graphviz 12.0.0 he14ced1_0 conda-forge
gtk2 2.24.33 h2c15c3c_5 conda-forge
gts 0.7.6 h53e17e3_4 conda-forge
harfbuzz 9.0.0 h098a298_1 conda-forge
hdf5 1.14.3 nompi_h687a608_105 conda-forge
icu 75.1 h120a0e1_0 conda-forge
imagemagick 7.1.1_38 pl5321hb0ca40e_0 conda-forge
jbig 2.1 h0d85af4_2003 conda-forge
krb5 1.21.3 h37d8d59_0 conda-forge
lcms2 2.16 ha2f27b4_0 conda-forge
lerc 4.0.0 hb486fe8_0 conda-forge
libaec 1.1.3 h73e2aa4_0 conda-forge
libarchive 3.7.4 h20e244c_0 conda-forge
libasprintf 0.22.5 hdfe23c8_3 conda-forge
libavif16 1.1.1 ha49a9e2_1 conda-forge
libcurl 8.10.1 h58e7537_0 conda-forge
libcxx 19.1.0 hf95d169_0 conda-forge
libde265 1.0.15 h7728843_0 conda-forge
libdeflate 1.21 hfdf4475_0 conda-forge
libdicom 1.0.5 h10d778d_1 conda-forge
libedit 3.1.20191231 h0678c8f_2 conda-forge
libev 4.33 h10d778d_2 conda-forge
libexif 0.6.21 h0d85af4_0 conda-forge
libexpat 2.6.3 hac325c4_0 conda-forge
libffi 3.4.2 h0d85af4_5 conda-forge
libgd 2.3.3 h2e77e4f_10 conda-forge
libgettextpo 0.22.5 hdfe23c8_3 conda-forge
libgfortran 5.0.0 13_2_0_h97931a8_3 conda-forge
libgfortran5 13.2.0 h2873a65_3 conda-forge
libglib 2.80.3 h736d271_2 conda-forge
libheif 1.18.2 gpl_h57a3ca0_100 conda-forge
libhwy 1.1.0 h7728843_0 conda-forge
libiconv 1.17 hd75f5a5_2 conda-forge
libintl 0.22.5 hdfe23c8_3 conda-forge
libjpeg-turbo 3.0.0 h0dc2134_1 conda-forge
libmatio 1.5.27 h74aa911_0 conda-forge
libnghttp2 1.58.0 h64cf6d3_1 conda-forge
libpng 1.6.44 h4b8f8c9_0 conda-forge
librsvg 2.58.4 h2682814_0 conda-forge
libsqlite 3.46.1 h4b8f8c9_0 conda-forge
libssh2 1.11.0 hd019ec5_0 conda-forge
libtiff 4.7.0 h5f227bf_0 conda-forge
libvips 8.15.3 h31d0c09_2 conda-forge
libwebp 1.4.0 hc207709_0 conda-forge
libwebp-base 1.4.0 h10d778d_0 conda-forge
libxcb 1.16 h00291cd_1 conda-forge
libxml2 2.12.7 heaf3512_4 conda-forge
libzlib 1.3.1 h87427d6_1 conda-forge
llvm-openmp 18.1.8 h15ab845_1 conda-forge
lz4-c 1.9.4 hf0c8a7f_0 conda-forge
lzo 2.10 h10d778d_1001 conda-forge
ncurses 6.5 hf036a51_1 conda-forge
nspr 4.35 hea0b92c_0 conda-forge
nss 3.104 h3135457_0 conda-forge
openjpeg 2.5.2 h7310d3a_0 conda-forge
openslide 4.0.0 h75f8748_1 conda-forge
openssl 3.3.2 hd23fc13_0 conda-forge
pango 1.54.0 h115fe74_2 conda-forge
pcre2 10.44 h7634a1b_2 conda-forge
perl 5.32.1 7_h10d778d_perl5 conda-forge
pip 24.2 pyh8b19718_1 conda-forge
pixman 0.43.4 h73e2aa4_0 conda-forge
pkg-config 0.29.2 hf7e621a_1009 conda-forge
pkgconfig 1.5.5 pyhd8ed1ab_4 conda-forge
poppler 24.08.0 h65860a0_1 conda-forge
poppler-data 0.4.12 hd8ed1ab_0 conda-forge
pthread-stubs 0.4 h00291cd_1002 conda-forge
pycparser 2.22 pyhd8ed1ab_0 conda-forge
python 3.12.5 h37a9e06_0_cpython conda-forge
python_abi 3.12 5_cp312 conda-forge
pyvips 2.2.3 py312hdb907c9_1 conda-forge
rav1e 0.6.6 h7205ca4_2 conda-forge
readline 8.2 h9e318b2_1 conda-forge
setuptools 74.1.2 pyhd8ed1ab_0 conda-forge
svt-av1 2.2.1 hac325c4_0 conda-forge
tk 8.6.13 h1abcd95_1 conda-forge
tzdata 2024a h8827d51_1 conda-forge
wheel 0.44.0 pyhd8ed1ab_0 conda-forge
x265 3.5 hbb4e6a2_3 conda-forge
xorg-kbproto 1.0.7 h00291cd_1003 conda-forge
xorg-libice 1.1.1 h0dc2134_0 conda-forge
xorg-libsm 1.2.4 h0dc2134_0 conda-forge
xorg-libx11 1.8.9 h7022169_1 conda-forge
xorg-libxau 1.0.11 h0dc2134_0 conda-forge
xorg-libxdmcp 1.1.3 h35c211d_0 conda-forge
xorg-libxext 1.3.4 hb7f2c08_2 conda-forge
xorg-libxrender 0.9.11 h0dc2134_0 conda-forge
xorg-libxt 1.3.0 h0dc2134_1 conda-forge
xorg-renderproto 0.11.1 h00291cd_1003 conda-forge
xorg-xextproto 7.3.0 h00291cd_1004 conda-forge
xorg-xproto 7.0.31 h00291cd_1008 conda-forge
xz 5.2.6 h775f41a_0 conda-forge
zlib 1.3.1 h87427d6_1 conda-forge
zstd 1.5.6 h915ae27_0 conda-forge

@nahilsobh

nahilsobh commented Sep 20, 2024

With Q=89 the image displayed correctly (but not with Q=90) using the conda env for pyvips.

jcupitt added a commit to libvips/libvips that referenced this issue Sep 20, 2024
JPEG in TIFF compression needs the jpeg encoding colourspace set to
match the enclosing tiff file.

fixes regression in 8.15.3 from #3924

see libvips/pyvips#490

reproduce error with

```
vips copy x.jpg x.tif[compression=jpeg,Q=90,pyramid]
```

thanks nahilsobh
@jcupitt
Member

jcupitt commented Sep 20, 2024

There was a bug introduced in 8.15.3 where the colourspace for jpeg tiles didn't match the photometric interpretation in the enclosing tiff. I've made a PR and credited you, the fix should be in 8.15.4.

Thanks for reporting this dumb thing!
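A toy illustration of why this kind of mismatch shows up as false colours. This is not libvips code, and the thread does not say which way the mismatch went; the sketch only shows what happens when pixel data stored as YCbCr is displayed as if it were RGB, using the standard full-range JFIF conversion. `rgb_to_ycbcr` is a hypothetical helper.

```python
def rgb_to_ycbcr(r, g, b):
    """Full-range JFIF RGB -> YCbCr conversion for 8-bit samples."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    # Round and clamp to the 0..255 range of an 8-bit sample.
    return tuple(min(255, max(0, round(v))) for v in (y, cb, cr))


# If a reader treats these YCbCr triples as RGB, this is what it displays.
red_misread = rgb_to_ycbcr(255, 0, 0)
white_misread = rgb_to_ycbcr(255, 255, 255)
print("pure red shown as RGB", red_misread)
print("white shown as RGB   ", white_misread)
```

Pure red comes out bluish-purple and white comes out pink, the kind of colour cast seen when the tile data and the TIFF's photometric tag disagree.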

@nahilsobh

No worries. I'm glad you found it. Thanks for all your efforts.

jcupitt added a commit to libvips/libvips that referenced this issue Sep 21, 2024
* call jpeg_set_colorspace for jpeg in tiff

JPEG in TIFF compression needs the jpeg encoding colourspace set to
match the enclosing tiff file.

fixes regression in 8.15.3 from #3924

see libvips/pyvips#490

reproduce error with

```
vips copy x.jpg x.tif[compression=jpeg,Q=90,pyramid]
```

thanks nahilsobh

Co-authored-by: Kleis Auke Wolthuizen <[email protected]>