
[BUG] dask cuml - TypeError: make_blobs() got an unexpected keyword argument 'workers' #5048

Closed
vdet opened this issue Dec 2, 2022 · 4 comments · Fixed by #5057
Labels: bug (Something isn't working), doc (Documentation)

vdet commented Dec 2, 2022

Describe the bug
I tried to run this example notebook:
https://medium.com/rapids-ai/scaling-knn-to-new-heights-using-rapids-cuml-and-dask-63410983acfe#6084

Cell 9 issues a warning:

# Initialize Dask client
client = Client(LocalCUDACluster())
/opt/conda/envs/rapids/lib/python3.9/site-packages/distributed/node.py:182: UserWarning: Port 8787 is already in use.
Perhaps you already have a cluster running?
Hosting the HTTP server on port 46729 instead
  warnings.warn(
/opt/conda/envs/rapids/lib/python3.9/site-packages/dask/config.py:660: UserWarning: Configuration key "ucx.cuda_copy" has been deprecated. Please use "distributed.ucx.cuda_copy" instead
  warnings.warn(
/opt/conda/envs/rapids/lib/python3.9/site-packages/dask/config.py:660: UserWarning: Configuration key "ucx.tcp" has been deprecated. Please use "distributed.ucx.tcp" instead
  warnings.warn(
/opt/conda/envs/rapids/lib/python3.9/site-packages/dask/config.py:660: UserWarning: Configuration key "ucx.nvlink" has been deprecated. Please use "distributed.ucx.nvlink" instead
  warnings.warn(
[...]

but client seems fine, all 8 GPUs are visible:

list(client.has_what())
['tcp://127.0.0.1:32843',
 'tcp://127.0.0.1:35799',
 'tcp://127.0.0.1:37093',
 'tcp://127.0.0.1:37843',
 'tcp://127.0.0.1:38527',
 'tcp://127.0.0.1:40273',
 'tcp://127.0.0.1:44021',
 'tcp://127.0.0.1:44995']
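
(As an aside, the "Port 8787 is already in use" warning can be silenced by giving the cluster an explicit dashboard address. A minimal sketch, assuming LocalCUDACluster forwards dashboard_address to the underlying distributed LocalCluster, where ":0" means "pick any free port":)

from dask.distributed import Client
from dask_cuda import LocalCUDACluster

# ":0" binds the dashboard to any free port, so a second cluster
# no longer collides with one already serving on 8787.
client = Client(LocalCUDACluster(dashboard_address=":0"))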

The notebook then fails on cell 12

cleanup_memory(client)

n_features = 256
n_queries = 800
n_neighbors = 8

MAX_SAMPLES_PER_WORKER = 10000000

runtimes = {}
for n_samples in [500000, 1000000, 2000000, 4000000]:
        n_parts = int(n_samples / 500000)
        sk_runtime, cu_2_runtime = benchmark(client,
                                                 n_samples,
                                                 n_features,
                                                 n_queries,
                                                 n_neighbors,
                                                 n_parts=max(n_parts,2),
                                                 n_workers=2,
                                                 reference='scikit')
        cu_4_runtime = benchmark(client,
                                 n_samples,
                                 n_features,
                                 n_queries,
                                 n_neighbors,
                                 n_parts=n_parts,
                                 n_workers=max(n_parts,4),
                                 reference='noref')
        if n_samples not in runtimes:
            runtimes[n_samples] = {}
        runtimes[n_samples]['sk_runtime'] = sk_runtime
        runtimes[n_samples]['cu_2_runtime'] = cu_2_runtime
        runtimes[n_samples]['cu_4_runtime'] = cu_4_runtime
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
/tmp/ipykernel_2029717/3260584239.py in <module>
     10 for n_samples in [500000, 1000000, 2000000, 4000000]:
     11         n_parts = int(n_samples / 500000)
---> 12         sk_runtime, cu_2_runtime = benchmark(client,
     13                                                  n_samples,
     14                                                  n_features,

/tmp/ipykernel_2029717/2652057255.py in benchmark(client, n_samples, n_features, n_queries, n_neighbors, n_parts, n_workers, reference)
      2 def benchmark(client, n_samples, n_features, n_queries, n_neighbors, n_parts=None, n_workers=None, reference='scikit'):
      3     # Generate distributed index and query
----> 4     dist_index, dist_query = generate_dist_dataset(client, n_samples, n_features, n_queries, n_parts=n_parts, n_workers=n_workers)
      5 
      6     # Bench distributed cuML (multiple measures to compensate for variability)

/tmp/ipykernel_2029717/2585364515.py in generate_dist_dataset(client, n_samples, n_features, n_queries, n_parts, n_workers)
      8 
      9     # Generate index of n_parts partitions on n_workers workers
---> 10     index, _ = dist_make_blobs(client=client,
     11                                workers=workers,
     12                                n_parts=n_parts,

/opt/conda/envs/rapids/lib/python3.9/site-packages/cuml/common/memory_utils.py in cupy_rmm_wrapper(*args, **kwargs)
    125     def cupy_rmm_wrapper(*args, **kwargs):
    126         with cupy_using_allocator(rmm.rmm_cupy_allocator):
--> 127             return func(*args, **kwargs)
    128 
    129     # Mark the function as already wrapped

TypeError: make_blobs() got an unexpected keyword argument 'workers'

Steps/Code to reproduce bug

  • Install RAPIDS from the nightly Docker image (Nov 30th; cuml version is 22.12.00a+49.g07f0bc4ba)
  • Add to cell 1:
    from dask_cuda import LocalCUDACluster
  • Change cell 9 to:
    # Initialize Dask client
    client = Client(LocalCUDACluster())
  • Run the notebook

Expected behavior
Notebook runs with no error.

Environment details:

  • Environment location: Docker container run via sudo singularity build
  • Linux Distro/Architecture: Ubuntu 20.04
  • GPU Model/Driver: 8 x A5000
  • CUDA: 11.5
  • Method of cuDF & cuML install: Docker (image converted with Singularity):
sudo singularity build rapidsai.sif docker://nvcr.io/nvidia/rapidsai/rapidsai-core-dev:22.10-cuda11.5-devel-ubuntu20.04-py3.9

vdet added the "? - Needs Triage" (Need team to review and classify) and "bug" (Something isn't working) labels on Dec 2, 2022
dantegd (Member) commented Dec 4, 2022

Thanks for the issue @vdet, it looks like an error in the notebook included in that blog post. The make_blobs function doesn't take a workers parameter: https://docs.rapids.ai/api/cuml/stable/api.html#cuml.dask.datasets.blobs.make_blobs

cc @viclafargue author of the blog post
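
Dropping the argument should be enough. A minimal sketch of the corrected call, going by the linked docs (the sizes and centers below are placeholders, not the notebook's values):

from dask.distributed import Client
from dask_cuda import LocalCUDACluster
from cuml.dask.datasets import make_blobs  # imported as dist_make_blobs in the notebook

client = Client(LocalCUDACluster())

# No `workers` argument: make_blobs spreads the n_parts partitions
# across the cluster's workers on its own.
dist_index, _ = make_blobs(client=client,
                           n_samples=500_000,
                           n_features=256,
                           centers=5,
                           n_parts=8)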

dantegd added the "doc" (Documentation) label and removed the "? - Needs Triage" label on Dec 4, 2022
vdet (Author) commented Dec 5, 2022

Indeed, I could run the code after removing workers from the make_blobs call. Thanks!

vdet closed this as completed on Dec 5, 2022
viclafargue (Contributor) commented Dec 5, 2022

Thanks for noticing the issue @vdet. At the time, I was working with a modified version of make_blobs that allowed leaving out some of the Dask cluster's workers instead of using them all. This was for the purposes of the benchmark; most people would normally use all of the workers. It is unfortunate, though, that the snippet does not work as published. Allowing workers as an optional parameter here would probably help.
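
In the meantime, anyone who wants the restrict-to-a-subset behaviour without a patched make_blobs can generate the blobs normally and then pin the collections to chosen workers with Dask's standard workers= placement hint. A rough sketch (not equivalent to the modified version: the data is generated cluster-wide first and only then moved):

from cuml.dask.datasets import make_blobs

# Choose a subset of the cluster's workers, e.g. the first two.
workers = list(client.has_what().keys())[:2]

# Generate normally; make_blobs itself takes no `workers` argument.
X, y = make_blobs(client=client,
                  n_samples=1_000_000,
                  n_features=256,
                  centers=5,
                  n_parts=2)

# Re-persist the collections onto the chosen workers only;
# `workers=` is a standard distributed.Client placement hint.
X = client.persist(X, workers=workers)
y = client.persist(y, workers=workers)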

vdet (Author) commented Dec 5, 2022

Thank you all for your answers!

rapids-bot pushed a commit that referenced this issue Dec 7, 2022
jakirkham pushed a commit to jakirkham/cuml that referenced this issue Feb 27, 2023