-
Notifications
You must be signed in to change notification settings - Fork 94
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Configuration Profiles #1020
Comments
Thanks a lot for starting this discussion @quasiben. These set of yaml files for pre-baked profiles is something I'm in favor of. I had a few questions around the usage of configuration files:
Generally I'm happy to help providing input for some of these default options, and also curious to hear if others opinions on what high level defaults make the most sense. |
Agree with this. Would make profiles less specific to a particular device type |
It seems that it would work for the scheduler as well, but maybe @quasiben can confirm this. This is nevertheless a good question, the client must also pick the same configs.
Theoretically we could pass this to Dask, but I'm generally -1 on that because we would need to create a new Dask config for every possible UCX configuration. I think we need to think of a more robust way to address this rather than individually adding a new config for every UCX option, but I don't know if we can add in Dask some kind of config translation, e.g.,
That's a good point, addressing this now in #1021 . |
We can, with something like this: diff --git a/distributed/comm/ucx.py b/distributed/comm/ucx.py
index fc07d048..254918cc 100644
--- a/distributed/comm/ucx.py
+++ b/distributed/comm/ucx.py
@@ -126,7 +126,16 @@ def init_once():
ucp = _ucp
- ucp.init(options=ucx_config, env_takes_precedence=True)
+ def munge(key):
+ return "_".join(map(str.upper, key.split("-")))
+
+ environment = {
+ munge(k): v
+ for k, v in dask.config.get("distributed.comm.ucx.environment", {}).items()
+ }
+ # Specific ucx_config options should override general environment ones
+ environment.update(ucx_config)
+ ucp.init(options=environment, env_takes_precedence=True)
pool_size_str = dask.config.get("distributed.rmm.pool-size")
diff --git a/distributed/distributed-schema.yaml b/distributed/distributed-schema.yaml
index 4e31840d..d8207e5a 100644
--- a/distributed/distributed-schema.yaml
+++ b/distributed/distributed-schema.yaml
@@ -961,7 +961,14 @@ properties:
additional variables for each transport, while ensuring optimal connectivity. When
``True``, a CUDA context will be created on the first device listed in
``CUDA_VISIBLE_DEVICES``.
-
+ environment:
+ type: [object, 'null']
+ description: |
+ Mapping setting specified UCX environment variables.
+ A name
+ ``distributed.comm.ucx.environment.some-option=value``
+ is equivalent to setting ``UCX_SOME_OPTION=value` in
+ the environment.
tcp:
type: object
properties:
diff --git a/distributed/distributed.yaml b/distributed/distributed.yaml
index 81033703..821232b1 100644
--- a/distributed/distributed.yaml
+++ b/distributed/distributed.yaml
@@ -229,7 +229,8 @@ distributed:
infiniband: null # enable Infiniband
rdmacm: null # enable RDMACM
create-cuda-context: null # create CUDA context before UCX initialization
-
+ environment:
+ max_copy_reg: 1
zstd:
level: 3 # Compression level, between 1 and 22.
threads: 0 # Threads to use. 0 for single-threaded, -1 to infer from cpu count. Then I can do: In[3]: dask.config.get("distributed.comm.ucx.environment")
Out[3]: {'max_copy_reg': 1} |
I'm 100% onboard with this idea @wence- , could you submit a PR to distributed? And thanks for one more neat solution Lawrence-style! |
More than happy to help on this . We can couple of these one for
I think we are all in agreement about allowing memory fraction is a good idea for defaults
I think we can think about creating one config in the cugraph repo too to help customers and users to the right configuration. |
This is already supported by `memory_limit`/`device_memory_limit`, and this has been raised in #1020 during discussions on how to make Dask Profiles usable in Dask-CUDA. Authors: - Peter Andreas Entschev (https://github.com/pentschev) Approvers: - Lawrence Mitchell (https://github.com/wence-) URL: #1021
#1021 is now in and should allow specifying |
Generic UCX environment setting via dask config is dask/distributed#7164 |
We've had several issues come in related to defaults/cluster configurations (#990 / #348 / ...) and a general request to support "Configuration Profiles". As there are many ways to configure a Dask Cluster: CLI, ENV Var, Yaml Config, etc -- users would like to have a higher level concept i.e. a Profile, which codifies are particular configuration. This might be a configuration related to performance, debugging, benchmarking, environment (cloud vs hpc, vs local), etc.
Earlier today, it was pointed out to me that Dask does in fact check a couple of locations for YAML config files and the location itself is also configurable via the
DASK_CONFIG
env var:For example, we can create the following RAPIDS-TPCDS/distributed.yaml
Then load like the following:
This is quite useful and would unblock, I think, several folks. However, I don't think we are done yet.
all
buttonFor problem 1) and 2), what would folks think about building a config system inside of dask-cuda. This would be similar to what dask-kubernetes and dask-sql do. If we had this, then would could also think moving/reducing the configuration inside distributed.
For 3) would could ask folks like @VibhuJawa @ayushdg and @randerzander for their input in building their defaults as highlighted options
Another thing we could also think about is pushing the DASK_CONFIG env var to a CLI option for workers and schedulers;
The text was updated successfully, but these errors were encountered: