Struggling with dask delayed to parallelize tidal analysis #194

rsignell-usgs · 2018-04-03T23:25:35Z

I told @rabernat I'd post this on SO, but the simplified test I made to isolate the problem for SO ended up working, so I'm posting here in case the pangeo env or workflow has something to do with the issue.

I'm trying to use dask delayed to do an embarrassingly parallel problem: apply a time series function (tidal analysis) to each (lat, lon) location in a 3D (time, lat, lon) data cube.

This simple example worked fine:

import dask.array as da
from dask import delayed
import numpy as np
from dask.distributed import Client, LocalCluster

def mysin(x):
    return np.sin(x)

dsin = delayed(mysin)
z = np.random.random((50,100,100))

cluster = LocalCluster(n_workers=3,ncores=3)
client = Client(cluster)

nt, n, m = z.shape
zr = [(dsin(z[:,j,i])) for j in range(n) for i in range(m)]
total = delayed(zr).compute()

but my tide notebook with tidal analysis on model output loaded from Zarr looks good up to the line:

%time total = delayed(coefs).compute()

where the kernel dies.

The logs look good just before the kernel dies, but then I can't access the logs since the kernel has died. 😢

Does anyone see anything obviously wrong with my tide notebook?

The text was updated successfully, but these errors were encountered:

ah- · 2018-04-03T23:34:50Z

Haven't looked in detail, but it seems like you might have a huge number of delayed tasks?

If so, it will probably run much better with fewer more coarse tasks. Or you might be able to use dask array/xarray to do the chunking for you?

rsignell-usgs · 2018-04-03T23:55:06Z

@ah, yes, there are about 125K delayed tasks. But I'm not sure how to make the task coarser. The tidal analysis function can't take larger chunks, like multiple time series at once, so I need to run it at each lon,lat location. And there are lots (125K) of lon,lat locations.

rsignell-usgs · 2018-04-04T03:44:55Z

So if I subsample the grid with nsub=4 instead of nsub=2 (35K delayed tasks), the kernel doesn't die, and the notebook works, producing the map of M2 amplitude below.

What controls whether the kernel will die (e.g. how many delayed tasks is "huge")?
What would I need to do differently with my workflow to get Dask to work with nsub=2 ?

pbranson · 2018-04-04T05:52:12Z

I am only just getting into dask/xarray so more experienced people may have a better suggestion, but it looks like one of these approaches should assist: https://xarray.pydata.org/en/stable/generated/xarray.apply_ufunc.html#xarray.apply_ufunc or http://dask.pydata.org/en/latest/array-api.html#dask.array.apply_along_axis So rather than your section which does a list comprehension: coefs = [usolve(t=t, u=z[:,j,i], v=None, lat=lat, trend=False, nodal=False, verbose=False, Rayleigh_min=0.95, method='ols', conf_int='linear') for j in range(jj) for i in range(ii)] Open the dataset with the 'chunks' parameter set for lat and lon to something reasonable that partitions your problem across the available processors and then using one of those ufunc approaches it should take care of the scheduling transparently to process your dataset rather than flooding it with a new function call for each grid cooordinate... I think! I think your question may pertain more to dask usage rather than pangeo specifically...

…

On Wed, Apr 4, 2018 at 11:44 AM, Rich Signell ***@***.***> wrote: So if I subsample the grid with nsub=4 instead of nsub=2 ( 35K delayed tasks), the kernel doesn't die, and the notebook works. What controls whether the kernel will die (e.g. how many delayed tasks is "huge")? What would I need to do differently with my workflow to get Dask to work with nsub=2 ? — You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub <#194 (comment)>, or mute the thread <https://github.com/notifications/unsubscribe-auth/AM3bQKfECBDEfneLZDjvlNN3-LzHwn9eks5tlEG5gaJpZM4TF79C> .

mrocklin · 2018-04-04T11:59:56Z

Each task takes up about 10kB and a millisecond of overhead

In [1]: from dask.distributed import Client, wait

In [2]: from distributed.utils import format_bytes

In [3]: import psutil

In [4]: client = Client()

In [5]: start = psutil.Process().memory_info().rss

In [6]: format_bytes(start)
Out[6]: '104.30 MB'

In [7]: def inc(x):
   ...:     return x + 1
   ...: 

In [8]: %time futures = client.map(inc, range(100000))
CPU times: user 3.41 s, sys: 72.8 ms, total: 3.48 s
Wall time: 3.49 s

In [9]: %time _ = wait(futures)
CPU times: user 32.4 s, sys: 235 ms, total: 32.6 s
Wall time: 33.1 s

In [10]: end = psutil.Process().memory_info().rss

In [11]: format_bytes(end)
Out[11]: '1.07 GB'

In [12]: format_bytes((end - start) / len(futures))
Out[12]: '9.64 kB'

If your tasks deal with times that are shorter than this then you might consider bundling several such operations within a single task, perhaps using a for loop within your function.

Typically I watch the diagnostic dashboard and, if I see a lot of white space in the task stream plot (the central plot on the status page) and only sporadic thin vertical bars then that is a sign that my system is spending most of its time in scheduling overhead (the 1ms cost) rather than actual computation. See dashboard video at this time

rsignell-usgs · 2018-04-04T14:55:09Z

@mrocklin, I checked out the dashboard, and it seems that dask with 35k delayed tasks is actually working very nicely (not too much white space):

So I then ran the tidal analysis in serial mode to see how much slower it was compared to my Dask workflow, and it took 1 hour 15 min instead of 1min 52 s. That's a speedup of 40, so perfect linear speedup with the 40 cpus in my Dask cluster:

So I'm pretty happy with the workflow. I just wished it worked with 100K or 500K tasks also.

mrocklin · 2018-04-04T14:57:37Z

You might check the System tab to watch your Scheduler's memory use during execution. This will help you to understand if you're reaching a memory limit on your notebook/scheduler pod due to having too much administrative data. 10kB per task * 100k tasks is around a GB. I think that notebook pods have something like 2-4GB?

…

On Wed, Apr 4, 2018 at 10:55 AM, Rich Signell ***@***.***> wrote: @mrocklin <https://github.com/mrocklin>, I checked out the dashboard, and it seems that dask with 35k delayed tasks is actually working very nicely (not too much white space): [image: 2018-04-04_8-43-17] <https://user-images.githubusercontent.com/1872600/38315384-60cba886-37f6-11e8-9aca-69087b05c8b2.png> So I then ran the tidal analysis in serial mode to see how much slower it was compared to my Dask workflow, and it took 1 hour 15 min instead of 1min 52 s. That's a speedup of 40, so perfect linear speedup with the 40 cpus in my Dask cluster: [image: 2018-04-04_10-14-35] <https://user-images.githubusercontent.com/1872600/38315394-67a2fb32-37f6-11e8-84ce-55990d7239df.png> So I'm pretty happy with the workflow. I just wished it worked with 100K or 500K tasks also. — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub <#194 (comment)>, or mute the thread <https://github.com/notifications/unsubscribe-auth/AASszCfVhpQKiI2Flh0oGZjM6tSQRA6fks5tlN7OgaJpZM4TF79C> .

mrocklin · 2018-04-04T14:58:17Z

Or, more generally, I don't know why things are crashing. You'll probably have to investigate typical things like "did I run out of memory?" to diagnose further. On Wed, Apr 4, 2018 at 10:57 AM, Matthew Rocklin <[email protected]> wrote:

…

You might check the System tab to watch your Scheduler's memory use during execution. This will help you to understand if you're reaching a memory limit on your notebook/scheduler pod due to having too much administrative data. 10kB per task * 100k tasks is around a GB. I think that notebook pods have something like 2-4GB? On Wed, Apr 4, 2018 at 10:55 AM, Rich Signell ***@***.***> wrote: > @mrocklin <https://github.com/mrocklin>, I checked out the dashboard, > and it seems that dask with 35k delayed tasks is actually working very > nicely (not too much white space): > > [image: 2018-04-04_8-43-17] > <https://user-images.githubusercontent.com/1872600/38315384-60cba886-37f6-11e8-9aca-69087b05c8b2.png> > > So I then ran the tidal analysis in serial mode to see how much slower it > was compared to my Dask workflow, and it took 1 hour 15 min instead of 1min > 52 s. That's a speedup of 40, so perfect linear speedup with the 40 cpus in > my Dask cluster: > > [image: 2018-04-04_10-14-35] > <https://user-images.githubusercontent.com/1872600/38315394-67a2fb32-37f6-11e8-84ce-55990d7239df.png> > > So I'm pretty happy with the workflow. I just wished it worked with 100K > or 500K tasks also. > > — > You are receiving this because you were mentioned. > Reply to this email directly, view it on GitHub > <#194 (comment)>, > or mute the thread > <https://github.com/notifications/unsubscribe-auth/AASszCfVhpQKiI2Flh0oGZjM6tSQRA6fks5tlN7OgaJpZM4TF79C> > . >

rsignell-usgs · 2018-04-04T16:07:31Z

@mrocklin, looks like the kernel dies when the memory on the "system" tab hits about 3.7GB:

It looks like the myworker.yml config I'm using has a 6GB limit:

$jovyan@jupyter-rsignell-2dusgs:~$ more myworker.yml
metadata:
spec:
  restartPolicy: Never
  containers:
  - args:
      - dask-worker
      - --nthreads
      - '2'
      - --no-bokeh
      - --memory-limit
      - 6GB
      - --death-timeout
      - '60'
    image: pangeo/worker:2018-03-28
    name: dask-worker
    securityContext:
      capabilities:
        add: [SYS_ADMIN]
      privileged: true
    env:
      - name: GCSFUSE_BUCKET
        value: pangeo-data
      - name: EXTRA_CONDA_PACKAGES
        value: utide -c conda-forge
    resources:
      limits:
        cpu: "1.75"
        memory: 6G
      requests:
        cpu: "1.75"
        memory: 6G

Does this info point to memory being the reason the kernel dies?
If so, can I just increase the limits in myworker.yml?

mrocklin · 2018-04-04T16:45:14Z

That is the memory limit for your workers. Your scheduler is running in the same process as your notebook (see cluster.scheduler) which I think has a 4GB limit as defined in the JupyterHub configuration.

…

On Wed, Apr 4, 2018 at 12:07 PM, Rich Signell ***@***.***> wrote: @mrocklin <https://github.com/mrocklin>, looks like the kernel dies when the memory on the "system" tab hits about 3.7GB: [image: 2018-04-04_11-45-40] <https://user-images.githubusercontent.com/1872600/38318976-b58aa914-37fe-11e8-9328-6b25092bc31f.png> It looks like the myworker.yml config I'm using has a 6GB limit: ***@***.***:~$ more myworker.ymlmetadata:spec: restartPolicy: Never containers: - args: - dask-worker - --nthreads - '2' - --no-bokeh - --memory-limit - 6GB - --death-timeout - '60' image: pangeo/worker:2018-03-28 name: dask-worker securityContext: capabilities: add: [SYS_ADMIN] privileged: true env: - name: GCSFUSE_BUCKET value: pangeo-data - name: EXTRA_CONDA_PACKAGES value: utide -c conda-forge resources: limits: cpu: "1.75" memory: 6G requests: cpu: "1.75" memory: 6G Does this info point to memory being the reason the kernel dies? If so, can I just increase the limits in myworker.yml? — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub <#194 (comment)>, or mute the thread <https://github.com/notifications/unsubscribe-auth/AASszBHkDKL3G1IM97fw9-RsI85BjaNdks5tlO_EgaJpZM4TF79C> .

stale · 2018-06-25T16:15:46Z

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale · 2018-07-02T16:38:21Z

This issue has been automatically closed because it had not seen recent activity. The issue can always be reopened at a later date.

rsignell-usgs · 2020-01-07T17:00:42Z

I'm revisiting this topic to give a summary of how I eventually addressed this issue. I've been able to make it work by splitting the large (500,000) list of dask tasks into 30,000 task chunks, and letting Dask work on each chunk in series. This way Dask stays busy but we don't run out of scheduler memory.

Each chunk takes about 3.25 minutes to run, with 2 minutes taken by the scheduler, then 1.5 minutes by the workers (with 120cpus) to do the tasks. The tasks each take about 300ms to run. In the Dask Documentation it says:

The scheduler adds about one millisecond of overhead per task or Future object. While this may sound fast it’s quite slow if you run a billion tasks. If your functions run faster than 100ms or so then you might not see any speedup from using distributed computing.

30,000 tasks taking 2 minutes in the scheduler works out to 4 milliseconds per task. Obviously it's not too optimal to be spending 60% of our time in the scheduler, and it would be great to be able to follow this advice from the Dask documentation to create fewer, longer running tasks:

A common solution is to batch your input into larger chunks.

Each task, however, is a call to a tidal analysis program with a single time series as input, so it would seem that we would need to modify the code for this program to accept multiple time series if we are to see any additional benefit.

Does this seem like an accurate assessment?

dcherian · 2020-01-07T18:57:09Z

(500,000) list of dask tasks

It looks like you are effectively chunking your dataset as each chunk being size 1 in lat, lon.

If so, you could use apply_ufunc(vectorize=True) to process larger chunks. That will still be slow but you could use numba.guvectorize instead of vectorize=True to (maybe) go faster. This example notebook may help: pydata/xarray#3629 (comments on the notebook are welcome)

guillaumeeb · 2020-01-07T20:03:12Z

Have you seen https://examples.dask.org/applications/embarrassingly-parallel.html#Handling-very-large-simulation-with-Bags ?

I'm not sure this applies to your problem though, as your input data is more complex than in the example. Trying to launch all at once would probably overwhelm the scheduler as you experienced...

rsignell-usgs · 2020-01-07T22:08:04Z

Thanks @dcherian and @guillaumeeb. I'm going to try recasting using that bag approach and I will report back. Also, just for reference, here is my current notebook solution.

mrocklin · 2020-01-07T22:28:11Z

I'd love to see a performance report if you have time to generate one. https://docs.dask.org/en/latest/diagnostics-distributed.html#capture-diagnostics

…

On Tue, Jan 7, 2020, 2:08 PM Rich Signell ***@***.***> wrote: Thanks @dcherian <https://github.com/dcherian> and @guillaumeeb <https://github.com/guillaumeeb>. I'm going to try recasting using that bag approach and I will report back. Also, just for reference, here is my current notebook solution <https://nbviewer.jupyter.org/gist/rsignell-usgs/508e3763538877a3189e07ff2932a6b8> . — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub <#194?email_source=notifications&email_token=AACKZTG7TQD2VIKJ2LQR6UDQ4T4MLA5CNFSM4EYXX5BKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEIKOTGQ#issuecomment-571795866>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AACKZTGSRNMH6RJ2Q4MDW6DQ4T4MLANCNFSM4EYXX5BA> .

rsignell-usgs · 2020-01-08T14:07:18Z

@mrocklin , cool, I did not know about this capability. Here's the dask performance report you requested!

mrocklin · 2020-01-08T21:52:57Z

Hrm, the scheduler and worker administrative profiles are blank. That's odd. Have you by any chance turned off profiling?

…

On Wed, Jan 8, 2020 at 6:07 AM Rich Signell ***@***.***> wrote: @mrocklin <https://github.com/mrocklin> , cool, I did not know about this capability. Here's the dask performance report <https://nbviewer.jupyter.org/github/rsignell-usgs/testing/blob/master/dask-report.html> you requested! — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub <#194?email_source=notifications&email_token=AACKZTDTAKCTPI62AGS7OVDQ4XMZPA5CNFSM4EYXX5BKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEIMRSBQ#issuecomment-572070150>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AACKZTAHHVC4XH4WRZDWFELQ4XMZPANCNFSM4EYXX5BA> .

rsignell-usgs · 2020-01-09T12:59:29Z

@mrocklin, I regenerated the dask performance report , and this time the scheduler and admin profiles are not blank!

mrocklin · 2020-01-09T15:33:50Z

Thanks @rsignell-usgs ! I guess what we're seeing there is that the scheduler and worker administrative threads just don't seem to be busy at all. They don't seem to be under load. You might add the following to your config to make sure that communication costs end up on the main thread, and see if that changes things qualitatively.

distributed
  comm:
    offload: False

Looking at your task stream plot, I also notice that you're spending a lot of time in inv. If this is something like np.linalg.inv I would encourage you to look at using a solve function instead if that makes sense. They generally do the same work, but are faster and more stable.

rsignell-usgs · 2020-01-10T21:54:01Z

Thanks @mrocklin, I'll check both of those things out. I'm pretty sure there is no reason to invert the matrix. So for the communications, just to make sure, I just add

distributed:
  comm:
    offload: False

to the end of my ~/.dask/config.yaml file?

rsignell-usgs · 2020-01-10T21:58:40Z

And thanks @guillaumeeb for the tip on dask bag for this workflow. I can now run 20% faster and don't have to create ugly loops or wait for the giant list of delayed functions to get created! The bag approach is in cells [17-25] of this new notebook.

guillaumeeb · 2020-01-11T00:10:18Z

Glad to hear that ! Just one question though, why are you using 60000 partitions ? That's quite a lot.

mrocklin · 2020-01-11T02:32:41Z

You'll want to include that in any other config you already have. You might already have a `distributed:` section, so you would want to add the comm section to that (assuming that you don't already have one).

…

On Fri, Jan 10, 2020 at 1:54 PM Rich Signell ***@***.***> wrote: Thanks @mrocklin <https://github.com/mrocklin>, I'll check both of those things out. I'm pretty sure there is no reason to invert the matrix. So for the communications, just to make sure, I just add distributed: comm: offload: False to the end of my ~/.dask/config.yaml file? — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub <#194?email_source=notifications&email_token=AACKZTARKNIGL47RC3EEGETQ5DU7ZA5CNFSM4EYXX5BKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEIVKJAI#issuecomment-573219969>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AACKZTC3KAZSA62IGVIUCS3Q5DU7ZANCNFSM4EYXX5BA> .

rsignell-usgs · 2020-01-11T13:07:49Z

@guillaumeeb, I tried 6,000 partitions instead of 60,000 partitions for my 500,000 tasks, which shortened my run time by only 4%, but it made my performance report 10 times smaller. Thanks for the idea! 😸

@mrocklin, adding those lines to the dask config indeed did the trick. Now I have filled-in admin tabs in my new smaller Dask performance report!

@dcherian , I did take a look at the ufunc approach you suggested, but the bag approach here seems more straightforward and likely to be comparable in performance. Do you agree?

mrocklin · 2020-01-11T17:39:47Z

The administrative tabs in the performance report still only report a few hundred milliseconds of time spent doing administrative work (like deserializing tasks or communicating and such). So I'm not sure how much overhead we're seeing there.

…

On Sat, Jan 11, 2020 at 5:07 AM Rich Signell ***@***.***> wrote: @guillaumeeb <https://github.com/guillaumeeb>, I tried 6,000 partitions instead of 60,000 partitions for my 500,000 tasks, which shortened my run time by only 4%, but it made my performance report 10 times smaller. Thanks for the idea! 😸 @mrocklin <https://github.com/mrocklin>, adding those lines to the dask config indeed did the trick. Now I have filled-in admin tabs in my new smaller Dask performance report <https://nbviewer.jupyter.org/github/rsignell-usgs/testing/blob/master/dask_6000.html> ! @dcherian <https://github.com/dcherian> , I did take a look at the ufunc approach <https://github.com/pydata/xarray/blob/8e5ec06f67f7130a3cc5bd4bf9dd1207b0211a90/doc/examples/apply_ufunc_vectorize_1d.ipynb> you suggested, but the bag approach here seems more straightforward and likely to be comparable in performance. Do you agree? — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub <#194?email_source=notifications&email_token=AACKZTBZY7SFHVHPR3AQEK3Q5HACNA5CNFSM4EYXX5BKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEIWBTDA#issuecomment-573315468>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AACKZTF46KIF4XOTJTVL6UDQ5HACNANCNFSM4EYXX5BA> .

rsignell-usgs · 2020-01-14T17:30:56Z

I just tried running the same job on a different HPC system with the same number of cores (120) and instead of taking 2.5 minutes it takes 25 minutes. The CPUS are not the same, but what could explain an order of magnitude difference in performance for these 500 tasks?
Here is the slow performance report.

mrocklin · 2020-01-14T18:18:17Z

It looks like you're spending a ton of time calling inv. You might look at your blas settings (over subscription of threads maybe?) and see if you can use a solve function instead?

…

On Tue, Jan 14, 2020, 9:30 AM Rich Signell ***@***.***> wrote: I just tried running the same job on a different HPC system with the same number of cores (120) and instead of taking 2.5 minutes it takes 25 minutes. The CPUS are not the same, but what could explain an order of magnitude difference in performance for these 500 tasks? Here is the slow performance report <https://nbviewer.jupyter.org/github/rsignell-usgs/testing/blob/master/dask_6000_poseidon.html> . — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub <#194?email_source=notifications&email_token=AACKZTBH67FJ4YXAMPAQ323Q5XZFDA5CNFSM4EYXX5BKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEI5OU3A#issuecomment-574286444>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AACKZTFPY3WEEYGUMV7YIMDQ5XZFDANCNFSM4EYXX5BA> .

rsignell-usgs · 2020-02-24T13:30:22Z

Understanding (and using) dask bag allowed me to solve this use case.

guziy · 2021-08-24T16:10:17Z

I am doing detiding using ttide (found it a bit faster than utide), similar problem)) I have 1.5 million points.

I do chunking i.e. n points (500) per task. And another thing that helped to make workers start faster is batch_size of the client.map... But I still find that the jobs could be starting earlier, here is the gist of my workflow..

with get_client(args) as client:

        # save dashboard link to the disk

        with Path("dashboard_link.html").open("w") as f:

            f.write(f"""<html><body><a href="{client.dashboard_link}">Dashboard</></body></html>""")

        n_tasks = len(inputs_chunks)
       
        # launch execution

        logger.info("Submitting %d tasks", n_tasks)

        max_n_batches = max(int(0.01 * n_tasks), 1)

        batch_size = n_tasks // max_n_batches

        logger.info("batch_size for client.map: %d", batch_size)

        futures = client.map(compute_surge_map_many, inputs_chunks, batch_size=batch_size)

        progress(futures)

        chunk_results = client.gather(futures)

        vals = np.array(sum(chunk_results, start=[])).T

        logger.info(f"Client gathered results: type=%s", type(vals[0]))

Do you have any suggestions for better calculation of the batch size or maybe other optimizations?

Currently, it takes 20 minutes to process on 800 cpus (1 thread per process) and the task flow is not so nicely filled as yours on the dashboard)). It would be cool if I could save the results without gathering into one netcdf directly from the workers, but I read that this is not supported for netcdf4/hdf5....

jacobtomlinson added user support dask labels Apr 26, 2018

stale bot added the stale label Jun 25, 2018

stale bot closed this as completed Jul 2, 2018

rsignell-usgs reopened this Jan 7, 2020

stale bot removed the stale label Jan 7, 2020

rsignell-usgs closed this as completed Feb 24, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Struggling with dask delayed to parallelize tidal analysis #194

Struggling with dask delayed to parallelize tidal analysis #194

rsignell-usgs commented Apr 3, 2018 •

edited

Loading

ah- commented Apr 3, 2018

rsignell-usgs commented Apr 3, 2018 •

edited

Loading

rsignell-usgs commented Apr 4, 2018 •

edited

Loading

pbranson commented Apr 4, 2018 via email

mrocklin commented Apr 4, 2018

rsignell-usgs commented Apr 4, 2018

mrocklin commented Apr 4, 2018 via email

mrocklin commented Apr 4, 2018 via email

rsignell-usgs commented Apr 4, 2018 •

edited

Loading

mrocklin commented Apr 4, 2018 via email

stale bot commented Jun 25, 2018

stale bot commented Jul 2, 2018

rsignell-usgs commented Jan 7, 2020 •

edited

Loading

dcherian commented Jan 7, 2020

guillaumeeb commented Jan 7, 2020

rsignell-usgs commented Jan 7, 2020

mrocklin commented Jan 7, 2020 via email

rsignell-usgs commented Jan 8, 2020 •

edited

Loading

mrocklin commented Jan 8, 2020 via email

rsignell-usgs commented Jan 9, 2020

mrocklin commented Jan 9, 2020

rsignell-usgs commented Jan 10, 2020

rsignell-usgs commented Jan 10, 2020 •

edited

Loading

guillaumeeb commented Jan 11, 2020 •

edited

Loading

mrocklin commented Jan 11, 2020 via email

rsignell-usgs commented Jan 11, 2020

mrocklin commented Jan 11, 2020 via email

rsignell-usgs commented Jan 14, 2020

mrocklin commented Jan 14, 2020 via email

rsignell-usgs commented Feb 24, 2020

guziy commented Aug 24, 2021 •

edited

Loading

Struggling with dask delayed to parallelize tidal analysis #194

Struggling with dask delayed to parallelize tidal analysis #194

Comments

rsignell-usgs commented Apr 3, 2018 • edited Loading

ah- commented Apr 3, 2018

rsignell-usgs commented Apr 3, 2018 • edited Loading

rsignell-usgs commented Apr 4, 2018 • edited Loading

pbranson commented Apr 4, 2018 via email

mrocklin commented Apr 4, 2018

rsignell-usgs commented Apr 4, 2018

mrocklin commented Apr 4, 2018 via email

mrocklin commented Apr 4, 2018 via email

rsignell-usgs commented Apr 4, 2018 • edited Loading

mrocklin commented Apr 4, 2018 via email

stale bot commented Jun 25, 2018

stale bot commented Jul 2, 2018

rsignell-usgs commented Jan 7, 2020 • edited Loading

dcherian commented Jan 7, 2020

guillaumeeb commented Jan 7, 2020

rsignell-usgs commented Jan 7, 2020

mrocklin commented Jan 7, 2020 via email

rsignell-usgs commented Jan 8, 2020 • edited Loading

mrocklin commented Jan 8, 2020 via email

rsignell-usgs commented Jan 9, 2020

mrocklin commented Jan 9, 2020

rsignell-usgs commented Jan 10, 2020

rsignell-usgs commented Jan 10, 2020 • edited Loading

guillaumeeb commented Jan 11, 2020 • edited Loading

mrocklin commented Jan 11, 2020 via email

rsignell-usgs commented Jan 11, 2020

mrocklin commented Jan 11, 2020 via email

rsignell-usgs commented Jan 14, 2020

mrocklin commented Jan 14, 2020 via email

rsignell-usgs commented Feb 24, 2020

guziy commented Aug 24, 2021 • edited Loading

rsignell-usgs commented Apr 3, 2018 •

edited

Loading

rsignell-usgs commented Apr 3, 2018 •

edited

Loading

rsignell-usgs commented Apr 4, 2018 •

edited

Loading

rsignell-usgs commented Apr 4, 2018 •

edited

Loading

rsignell-usgs commented Jan 7, 2020 •

edited

Loading

rsignell-usgs commented Jan 8, 2020 •

edited

Loading

rsignell-usgs commented Jan 10, 2020 •

edited

Loading

guillaumeeb commented Jan 11, 2020 •

edited

Loading

guziy commented Aug 24, 2021 •

edited

Loading