This issue is a summary of some exploratory work on using the GPU for some computations in Satpy.
Setup
Using a SAR-C scene from Sentinel-1, the idea was to try to speed up image generation without any resampling.
This data is quite complex in that full arrays of noise estimates for denoising need to be built from sparse arrays.
The data is about 20000x10000 pixels.
To work on this, we used a GPU-enabled server and created a new environment that includes cupy, cupy-xarray, satpy and trollimage.
Cupy had to be built from the source of the main branch, as some features like interpolation were not available otherwise.
Satpy and Trollimage were installed in editable mode.
Experiment
We switched on the use of cupy-backed arrays in dask with:
dask.config.set({"array.backend": "cupy"})
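For reference, here is a minimal sketch of what that setting changes, assuming a recent dask with backend dispatching and cupy installed (this is not the experiment code itself): dask creation functions then produce arrays whose chunks are cupy.ndarray instead of numpy.ndarray.

```python
import dask
import dask.array as da

# With the cupy backend selected, dask creation functions build arrays
# whose chunks live on the GPU as cupy.ndarray instead of numpy.ndarray.
with dask.config.set({"array.backend": "cupy"}):
    x = da.ones((4096, 4096), chunks=(1024, 1024))

# _meta is the empty array dask keeps around to record the chunk type.
print(type(x._meta))  # expected: <class 'cupy.ndarray'>
```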
Then, multiple places were adjusted to make use of cupy when available.
Satpy didn't need any modifications anywhere other than in the reader.
Trollimage needed some small adjustments in the enhancement part to ensure cupy arrays were preserved.
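The adjustments were along these lines (a hypothetical illustration of the pattern, not the actual patch): dispatch on the array's module so that the same enhancement code runs on numpy and cupy arrays without forcing a transfer back to the host.

```python
import numpy as np

try:
    import cupy
except ImportError:  # CPU-only environments keep working
    cupy = None


def _array_module(arr):
    """Return numpy or cupy depending on where the array lives."""
    if cupy is not None:
        return cupy.get_array_module(arr)
    return np


def apply_gamma(arr, gamma=2.0):
    # Hypothetical enhancement step: by asking the array for its module,
    # cupy inputs stay on the GPU and numpy inputs stay on the CPU.
    xp = _array_module(arr)
    return xp.clip(arr, 0, 1) ** (1.0 / gamma)
```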
Note that reading the data was done on the CPU, as was the writing.
The test script that reads the data, generates the composite and saves it to disk was called with:
GDAL_NUM_THREADS=ALL_CPUS DASK_NUM_WORKERS=8 DASK_ARRAY__CHUNK_SIZE=32MB /usr/bin/time -v python test_s1.py
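The script itself is roughly of this shape, using Satpy's Sentinel-1 SAR-C SAFE reader; the paths and the composite name are placeholders, and the exact composite used in the experiment may differ.

```python
# Rough shape of test_s1.py; paths and the composite name are placeholders.
from glob import glob

from satpy import Scene

filenames = glob("/data/S1A_IW_GRDH_1SDV_*.SAFE/**/*.*", recursive=True)
scn = Scene(filenames=filenames, reader="sar-c_safe")
scn.load(["sar-ice"])
scn.save_datasets(base_dir="/tmp", writer="geotiff")
```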
Results
The data was read, composited and written without problems as far as execution goes.
Performance was however slower than in the CPU case: about 45 seconds instead of 40 seconds. Maximum memory usage dropped by roughly 15% in the GPU case, and CPU usage was much lower (362% vs 655%), quite logically.
# on GPU
User time (seconds): 156.38
System time (seconds): 9.59
Percent of CPU this job got: 362%
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:45.83
Average shared text size (kbytes): 0
Average unshared data size (kbytes): 0
Average stack size (kbytes): 0
Average total size (kbytes): 0
Maximum resident set size (kbytes): 3819596
Average resident set size (kbytes): 0
Major (requiring I/O) page faults: 0
Minor (reclaiming a frame) page faults: 2351368
Voluntary context switches: 131790
Involuntary context switches: 33327
Swaps: 0
File system inputs: 0
File system outputs: 513936
Socket messages sent: 0
Socket messages received: 0
Signals delivered: 0
Page size (bytes): 4096
# on CPU
User time (seconds): 252.34
System time (seconds): 15.85
Percent of CPU this job got: 655%
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:40.94
Average shared text size (kbytes): 0
Average unshared data size (kbytes): 0
Average stack size (kbytes): 0
Average total size (kbytes): 0
Maximum resident set size (kbytes): 4534916
Average resident set size (kbytes): 0
Major (requiring I/O) page faults: 0
Minor (reclaiming a frame) page faults: 1226254
Voluntary context switches: 134968
Involuntary context switches: 61213
Swaps: 0
File system inputs: 0
File system outputs: 639520
Socket messages sent: 0
Socket messages received: 0
Signals delivered: 0
Page size (bytes): 4096
Reflection
It is likely that the decreased performance is due to the overhead of reading the data on the CPU, transferring it to the GPU for computation, and then transferring it back to the CPU for writing. The processing of the image data is probably too light for GPU execution to make a significant difference.
At the time of writing, we weren't able to find Python libraries that would let us read or write GeoTIFF files (the input was also GeoTIFF) directly into GPU memory. It should be possible though, and to go further on this topic we should check the kvikio engine for xarray, which can read zarr data to the GPU directly (see the example here: https://xarray.dev/blog/xarray-kvikio).
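From that blog post, the read side would look roughly like this, assuming the scene has first been converted to zarr and that kvikio and cupy-xarray are installed (the store path is a placeholder):

```python
import xarray as xr

# Open a zarr store with the kvikio engine; variables are loaded straight
# into cupy arrays on the GPU instead of numpy arrays in host memory.
ds = xr.open_dataset("s1_scene.zarr", engine="kvikio", consolidated=False)
```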
Rasterio uses Cython to load the data into a numpy array, and at the moment cupy does not have a stable or documented Cython interface.
Another solution would be to go directly through the GPUDirect Storage (GDS) interface, as kvikio does.
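As a rough idea of what that could look like with kvikio's CuFile (the file name and buffer size are placeholders, and decoding the GeoTIFF structure on the GPU would still be an open problem):

```python
import cupy
import kvikio

# Read raw bytes from disk straight into GPU memory via GDS/cuFile,
# without bouncing through host RAM.
buf = cupy.empty(10_000_000, dtype=cupy.uint8)
with kvikio.CuFile("measurement.tiff", "r") as f:
    n_read = f.read(buf)
```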