Unrelated updates in overlay with dmap triggers computation of another dmap #6135

Open
ahuang11 opened this issue Feb 27, 2024 · 6 comments
Labels
type: enhancement Minor feature or improvement to an existing feature

Comments

@ahuang11
Collaborator

ahuang11 commented Feb 27, 2024

import xarray as xr
import panel as pn
import dask.array as da

pn.extension(throttled=True)
import holoviews as hv
from holoviews.operation.datashader import rasterize
from holoviews.streams import RangeXY

hv.extension("bokeh")

DATA_ARRAY = "10000frames"

# create fake dask array
data = da.random.random((100000, 100, 100), chunks=(100, 100, 100))
data = xr.DataArray(data, dims=["frame", "height", "width"])

FRAMES_PER_SECOND = 30
FRAMES = data.coords["frame"].values


def plot_image(value):
    return hv.Image(data.sel(frame=value), kdims=["width", "height"]).opts(
        cmap="Viridis",
        frame_height=400,
        frame_width=400,
        colorbar=False,
    )

# Create a video player widget
video_player = pn.widgets.Player(
    length=len(data.coords["frame"]),
    interval=1000 // FRAMES_PER_SECOND,  # ms
    value=int(FRAMES.min()),
    max_width=400,
    max_height=90,
    loop_policy="loop",
    sizing_mode="stretch_width",
)

# Create the main plot
main_plot = hv.DynamicMap(
    plot_image, kdims=["value"], streams=[video_player.param.value]
)

# frame indicator lines on side plots
line_opts = dict(color="red", alpha=0.6, line_width=3)
dmap_vline = hv.DynamicMap(
    pn.bind(lambda value: hv.VLine(value), video_player)
).opts(
    **line_opts
)


# height side view
right_data = data.mean(["width"])
right_plot = rasterize(
    hv.Image(right_data, kdims=["frame", "height"]).opts(
        cmap="Viridis",
        frame_height=400,
        frame_width=200,
        colorbar=False,
        title="_",
    ),
    streams=[RangeXY()],
)

sim_app = pn.Column(
    video_player,
    pn.Row(main_plot, right_plot * dmap_vline),
)

sim_app

If right_data is loaded into memory first, things move much quicker:

right_data = data.mean(["width"]).load()

ahuang11 added the type: enhancement label Feb 27, 2024
ahuang11 changed the title from "Unrelated updates in overlay triggers computation of unrelated plot" to "Unrelated updates in overlay with dmap triggers computation of another dmap" Feb 27, 2024

@philippjfr
Member

I was a little surprised. The caching on the right_plot is working just fine, and since the plot only sees the rasterized data, it should never attempt to recompute the actual data. What I think is happening is that datashader applies the regridding lazily, so whenever anything on the right_plot is recomputed it has to go all the way back to the raw underlying data, apply the reduction, and then apply the regridding.

Based on that, everything is working correctly here. I think the only real fix is to warn users that if they pass a non-persisted Dask-backed array to rasterize/regrid, interactive performance will be significantly degraded. Alternatively, rasterize should automatically persist the regridded result.
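
The effect described in this comment can be reproduced without HoloViews or datashader. The following is a minimal sketch, not the code path the libraries actually take: `expensive`, `value_range`, and `counter` are hypothetical names, and the tiny 4x4 array stands in for the large Dask-backed data. It counts how often a per-chunk step runs when a range (min/max) is requested repeatedly on a lazy array versus a persisted one.

```python
import threading

import dask
import dask.array as da
import numpy as np

# Count how often the simulated expensive per-chunk step runs.
counter = {"calls": 0}
lock = threading.Lock()


def expensive(block):
    """Hypothetical stand-in for costly per-chunk work (e.g. a reduction on raw data)."""
    with lock:
        counter["calls"] += 1
    return block * 2.0


raw = da.from_array(np.arange(16.0).reshape(4, 4), chunks=(2, 2))  # 4 chunks
lazy = raw.map_blocks(expensive, dtype=float)  # nothing is computed yet


def value_range(arr):
    """Compute (min, max), roughly what a plot needs to set its color range."""
    lo, hi = dask.compute(arr.min(), arr.max())
    return float(lo), float(hi)


# Counts are taken relative to `before`, so any call dask makes while
# building the graph is not included.
before = counter["calls"]
lazy_range = value_range(lazy)
value_range(lazy)  # a second "refresh" re-runs the whole graph
lazy_calls = counter["calls"] - before

persisted = lazy.persist()  # run the graph once and keep the chunks in memory
before = counter["calls"]
persisted_range = value_range(persisted)
value_range(persisted)  # reuses the in-memory chunks
persisted_calls = counter["calls"] - before

print("calls while lazy:", lazy_calls)
print("calls after persist:", persisted_calls)
```

Persisting only helps when the persisted result fits in memory, which is the limitation raised later in this thread.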

@philippjfr
Member

philippjfr commented Feb 27, 2024

Some timings, indicating that the time is spent almost entirely in range calculations.

No .persist()

         685761 function calls (681648 primitive calls) in 2.409 seconds

   Ordered by: cumulative time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    2.409    2.409 {built-in method builtins.exec}
        1    0.000    0.000    2.409    2.409 <string>:1(<module>)
    365/1    0.000    0.000    2.409    2.409 parameterized.py:518(_f)
      7/1    0.000    0.000    2.409    2.409 parameters.py:533(__set__)
    335/1    0.001    0.000    2.409    2.409 parameterized.py:1443(__set__)
        3    0.000    0.000    2.409    0.803 parameterized.py:2473(_call_watcher)
        3    0.000    0.000    2.409    0.803 parameterized.py:2456(_execute_watcher)
        2    0.000    0.000    2.408    1.204 streams.py:774(_watcher)
        2    0.000    0.000    2.408    1.204 streams.py:149(trigger)
        2    0.000    0.000    2.407    1.204 plot.py:210(refresh)
        2    0.000    0.000    2.407    1.203 plot.py:252(_trigger_refresh)
        2    0.000    0.000    2.407    1.203 plot.py:953(update)
        2    0.000    0.000    2.407    1.203 plot.py:434(__getitem__)
    16/14    0.000    0.000    2.371    0.169 __init__.py:186(pipelined_fn)
        1    0.000    0.000    2.357    2.357 element.py:2995(update_frame)
        3    0.000    0.000    2.354    0.785 base.py:605(compute)
        2    0.000    0.000    2.351    1.176 plot.py:574(compute_ranges)
        3    0.000    0.000    2.351    0.784 plot.py:692(_compute_group_range)
        6    0.000    0.000    2.346    0.391 raster.py:496(range)
        2    0.000    0.000    2.345    1.173 __init__.py:488(range)
        2    0.000    0.000    2.345    1.172 xarray.py:287(range)
        3    0.001    0.000    2.270    0.757 threaded.py:36(get)
        3    0.014    0.005    2.269    0.756 local.py:350(get_async)
     2078    0.001    0.000    2.125    0.001 local.py:136(queue_get)
     2078    0.008    0.000    2.123    0.001 queue.py:154(get)
     1210    0.006    0.000    2.107    0.002 threading.py:280(wait)
     6918    2.098    0.000    2.098    0.000 {method 'acquire' of '_thread.lock' objects}

With .persist()

         586664 function calls (583555 primitive calls) in 0.272 seconds

   Ordered by: cumulative time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    0.272    0.272 {built-in method builtins.exec}
        1    0.000    0.000    0.272    0.272 <string>:1(<module>)
    365/1    0.000    0.000    0.272    0.272 parameterized.py:518(_f)
      7/1    0.000    0.000    0.272    0.272 parameters.py:533(__set__)
    335/1    0.002    0.000    0.272    0.272 parameterized.py:1443(__set__)
        3    0.000    0.000    0.272    0.091 parameterized.py:2473(_call_watcher)
        3    0.000    0.000    0.272    0.091 parameterized.py:2456(_execute_watcher)
        2    0.000    0.000    0.270    0.135 streams.py:774(_watcher)
        2    0.000    0.000    0.270    0.135 streams.py:149(trigger)
        2    0.000    0.000    0.270    0.135 plot.py:210(refresh)
        2    0.000    0.000    0.268    0.134 plot.py:252(_trigger_refresh)
        2    0.000    0.000    0.268    0.134 plot.py:953(update)
        2    0.000    0.000    0.268    0.134 plot.py:434(__getitem__)
    16/14    0.000    0.000    0.229    0.016 __init__.py:186(pipelined_fn)
        1    0.000    0.000    0.215    0.215 element.py:2995(update_frame)
        2    0.000    0.000    0.209    0.105 plot.py:574(compute_ranges)
        3    0.000    0.000    0.209    0.070 plot.py:692(_compute_group_range)
        6    0.000    0.000    0.202    0.034 raster.py:496(range)
        3    0.000    0.000    0.202    0.067 base.py:605(compute)
        2    0.000    0.000    0.201    0.101 __init__.py:488(range)
        2    0.000    0.000    0.201    0.100 xarray.py:287(range)
        3    0.001    0.000    0.153    0.051 threaded.py:36(get)
        3    0.005    0.002    0.152    0.051 local.py:350(get_async)
     2070    0.001    0.000    0.075    0.000 local.py:136(queue_get)
     2070    0.003    0.000    0.074    0.000 queue.py:154(get)
      260    0.001    0.000    0.068    0.000 threading.py:280(wait)
     3121    0.067    0.000    0.067    0.000 {method 'acquire' of '_thread.lock' objects}
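
The command used to produce the two listings above is not shown in the thread. As a rough sketch, a profile in the same format (sorted by cumulative time) can be collected with the standard-library `cProfile` and `pstats` modules. Here `refresh` and `slow_range` are hypothetical stand-ins for whatever triggers the plot update; in the issue's setup that would presumably be an assignment to `video_player.value`.

```python
import cProfile
import io
import pstats
import time


def slow_range():
    """Hypothetical stand-in for an expensive range calculation."""
    time.sleep(0.05)
    return (0.0, 1.0)


def refresh():
    """Hypothetical stand-in for the call that triggers a plot update."""
    return slow_range()


profiler = cProfile.Profile()
profiler.enable()
result = refresh()
profiler.disable()

# Sort by cumulative time, as in the listings above, and keep the top entries.
buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)
stats.sort_stats("cumulative").print_stats(10)
report = buffer.getvalue()
print(report)
```

Reading the cumtime column from the top down shows which call chain dominates; in the listings above, that chain leads through compute_ranges into the Dask scheduler.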

@ahuang11
Collaborator Author

> Alternatively rasterize should automatically persist the regridded result.

I think this would be ideal, since avoiding this:

> if they pass a non-persisted Dask backed array to rasterize/regrid then the interactive performance will be significantly degraded

is not possible in some cases if the data is much larger.

@philippjfr
Member

> Alternatively rasterize should automatically persist the regridded result.

Unfortunately, I found this to make effectively zero difference.

@droumis
Member

droumis commented Jun 24, 2024

@philippjfr, is there a potential solution for a larger non-persisted Dask-backed array to make use of rasterize in this context?
