Investigations on using GPU for computations in satpy #2990

Open
mraspaud opened this issue Nov 20, 2024 · 0 comments

This issue summarizes some exploratory work on using a GPU for some of the computations in satpy.

Setup

Using a SAR-C scene from Sentinel-1, the idea was to try to speed up image generation without any resampling.
This data is quite complex to process, in that full arrays of noise estimates have to be built from sparse arrays for denoising.
The data is about 20000x10000 pixels.

To work on this, we used a GPU-enabled server and created a new environment that includes cupy, cupy-xarray, satpy and trollimage.

Cupy had to be built from the source of the main branch, as some features, like interpolation, were not available otherwise.

Satpy and Trollimage were installed in editable mode.

Experiment

We switched on the use of cupy-backed arrays in dask with:

    dask.config.set({"array.backend": "cupy"})
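For reference, a minimal sketch (not from the issue) of what this setting changes, assuming a recent dask and a working cupy installation: dask array creation functions then produce cupy-backed chunks instead of numpy-backed ones.

    import dask
    import dask.array as da

    dask.config.set({"array.backend": "cupy"})

    # Creation functions now build chunks with cupy instead of numpy.
    arr = da.ones((4096, 4096), chunks=(1024, 1024))
    print(type(arr._meta))  # cupy.ndarray rather than numpy.ndarray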

Then, multiple places were adjusted to make use of cupy when available.
Satpy didn't need any modifications anywhere other than in the reader.
Trollimage needed some small adjustments in the enhancement part to ensure cupy arrays were preserved; see the sketch below.
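The adjustments were roughly of the following kind (a hedged sketch; `scale_chunk` is an illustrative helper, not the actual trollimage code): pick the array module that matches the input, so that cupy arrays are processed with cupy and stay on the GPU.

    import numpy as np

    def scale_chunk(chunk, low, high):
        """Linear stretch of one chunk, keeping cupy input on the GPU."""
        try:
            import cupy
            xp = cupy.get_array_module(chunk)  # cupy for cupy arrays, numpy otherwise
        except ImportError:
            xp = np
        return xp.clip((chunk - low) / (high - low), 0.0, 1.0)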

Note that reading the data was done on the CPU, as was the writing, so the chunks had to be moved to the GPU for the computations and back again afterwards (see the sketch below).
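A minimal sketch of that round trip, assuming the reader produces a regular numpy-backed dask array (the variable names and shapes are illustrative only):

    import cupy
    import dask.array as da

    # Stand-in for data read on the CPU by the reader.
    cpu_chunks = da.random.random((20000, 10000), chunks=(2000, 2000))

    # Push each chunk to the GPU for the computations...
    gpu_chunks = cpu_chunks.map_blocks(cupy.asarray)

    # ...and bring the result back to host memory before writing.
    back_on_cpu = gpu_chunks.map_blocks(cupy.asnumpy)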

The test script that reads the data, generates the composite and saves it to disk was called with:

    GDAL_NUM_THREADS=ALL_CPUS DASK_NUM_WORKERS=8 DASK_ARRAY__CHUNK_SIZE=32MB /usr/bin/time -v python test_s1.py
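For context, a rough reconstruction of what test_s1.py does (the file path and the composite name are assumptions, not taken from the issue): read the Sentinel-1 SAR-C scene with satpy, generate a composite, and write it to disk as a GeoTIFF.

    from satpy import Scene

    filenames = ["/path/to/S1A_IW_GRDH_...SAFE"]  # placeholder path
    scn = Scene(filenames=filenames, reader="sar-c_safe")
    scn.load(["ocean_color"])  # composite name is an assumption
    scn.save_datasets(writer="geotiff", base_dir="/tmp")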

Results

The data was read, composited and written without problems as far as execution goes.
The performance was, however, slower than in the CPU-only case: about 45 seconds instead of 40 seconds. Memory usage dropped by roughly 15% in the GPU case, and the CPU load was (quite logically) much lower.

on GPU

    User time (seconds): 156.38
    System time (seconds): 9.59
    Percent of CPU this job got: 362%
    Elapsed (wall clock) time (h:mm:ss or m:ss): 0:45.83
    Average shared text size (kbytes): 0
    Average unshared data size (kbytes): 0
    Average stack size (kbytes): 0
    Average total size (kbytes): 0
    Maximum resident set size (kbytes): 3819596
    Average resident set size (kbytes): 0
    Major (requiring I/O) page faults: 0
    Minor (reclaiming a frame) page faults: 2351368
    Voluntary context switches: 131790
    Involuntary context switches: 33327
    Swaps: 0
    File system inputs: 0
    File system outputs: 513936
    Socket messages sent: 0
    Socket messages received: 0
    Signals delivered: 0
    Page size (bytes): 4096

on CPU

    User time (seconds): 252.34
    System time (seconds): 15.85
    Percent of CPU this job got: 655%
    Elapsed (wall clock) time (h:mm:ss or m:ss): 0:40.94
    Average shared text size (kbytes): 0
    Average unshared data size (kbytes): 0
    Average stack size (kbytes): 0
    Average total size (kbytes): 0
    Maximum resident set size (kbytes): 4534916
    Average resident set size (kbytes): 0
    Major (requiring I/O) page faults: 0
    Minor (reclaiming a frame) page faults: 1226254
    Voluntary context switches: 134968
    Involuntary context switches: 61213
    Swaps: 0
    File system inputs: 0
    File system outputs: 639520
    Socket messages sent: 0
    Socket messages received: 0
    Signals delivered: 0
    Page size (bytes): 4096

Reflection

It is likely that the decreased performance is due to the overhead of reading the data on the CPU, then transferring it to the GPU for computation, and then transferring it back to the CPU for writing. The processing of the image data is probably too light for GPU execution to have a significant impact.

At the time of writing, we weren't able to find python libraries that would allow us to read or write geotiff files (the input was also geotiff) directly into GPU memory. It is possible though, and to go further on this topic we should check the kvikio engine for xarray, which can read zarr data directly to the GPU (see the example at https://xarray.dev/blog/xarray-kvikio and the sketch below).
Rasterio uses cython to load the data into a numpy array, and at the moment cupy does not have a stable or documented cython interface.
Another solution would be to go directly through the GPUDirect Storage (GDS) interface, like kvikio does.
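A minimal sketch based on the linked blog post (the store path is a placeholder, and this only applies to zarr input, not the geotiff data used here):

    import cupy_xarray  # noqa: F401  (registers the kvikio backend)
    import xarray as xr

    # Reads the zarr store directly into GPU memory; variables are cupy-backed.
    ds = xr.open_dataset("/path/to/data.zarr", engine="kvikio", consolidated=False)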
