This repository has been archived by the owner on Nov 1, 2024. It is now read-only.

HPO studio notebook fails when importing cuDF #185

Open
hcho3 opened this issue Aug 16, 2022 · 4 comments

Comments

@hcho3
Contributor

hcho3 commented Aug 16, 2022

https://github.com/rapidsai/cloud-ml-examples/blob/main/aws/rapids_studio_hpo.ipynb

Traceback (most recent call last):
  File "train.py", line 76, in <module>
    train()
  File "train.py", line 27, in train
    ml_workflow = create_workflow(hpo_config)
  File "/opt/ml/code/MLWorkflow.py", line 37, in create_workflow
    from workflows.MLWorkflowSingleGPU import MLWorkflowSingleGPU
  File "/opt/ml/code/workflows/MLWorkflowSingleGPU.py", line 21, in <module>
    import cudf
  File "/opt/conda/envs/rapids/lib/python3.8/site-packages/cudf/__init__.py", line 71, in <module>
    from cudf.io import (
  File "/opt/conda/envs/rapids/lib/python3.8/site-packages/cudf/io/__init__.py", line 8, in <module>
    from cudf.io.orc import read_orc, read_orc_metadata, to_orc
  File "/opt/conda/envs/rapids/lib/python3.8/site-packages/cudf/io/orc.py", line 14, in <module>
    from cudf.utils.metadata import (  # type: ignore
  File "/opt/conda/envs/rapids/lib/python3.8/site-packages/cudf/utils/metadata/orc_column_statistics_pb2.py", line 7, in <module>
    from google.protobuf.internal import builder as _builder
ImportError: cannot import name 'builder' from 'google.protobuf.internal' (/opt/conda/envs/rapids/lib/python3.8/site-packages/google/protobuf/internal/__init__.py)

This is likely due to a mismatch in Protobuf versions.
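One way to check the suspected mismatch from inside the failing environment is sketched below. The traceback shows that the installed protobuf lacks `google.protobuf.internal.builder`, which the generated `orc_column_statistics_pb2.py` expects. The helper name `protobuf_builder_status` is made up for this illustration and is not part of cuDF or protobuf; it only reports the installed protobuf version and whether the `builder` module can be imported.

```python
import importlib


def protobuf_builder_status():
    """Return (installed protobuf version or None, whether `builder` imports).

    Hypothetical diagnostic helper; not part of cuDF or protobuf.
    """
    try:
        import google.protobuf as pb
    except ImportError:
        # protobuf is not installed at all in this environment
        return None, False

    version = getattr(pb, "__version__", None)
    try:
        # This is the same module the generated _pb2 file tries to import
        importlib.import_module("google.protobuf.internal.builder")
        has_builder = True
    except ImportError:
        has_builder = False
    return version, has_builder


if __name__ == "__main__":
    version, has_builder = protobuf_builder_status()
    print(f"protobuf version: {version}, builder importable: {has_builder}")
```

If `builder` is not importable, the protobuf runtime in the environment is probably older than the one used to generate the `_pb2` file, which would match the `ImportError` above.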

@jacobtomlinson
Member

The title says parquet but the traceback shows cudf/io/orc.py which suggests the data is ORC.

@hcho3
Contributor Author

hcho3 commented Aug 19, 2022

Actually, the error occurs when `import cudf` is executed (before `read_parquet` is called). Importing cuDF also imports `cudf.io.orc`, and that import fails.
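The mechanism can be reproduced with a toy package. This is only a sketch: `toypkg` and `missing_dependency_for_demo` are hypothetical names invented for the example, standing in for cuDF and the missing protobuf module. The package's `__init__.py` eagerly imports its `io.orc` submodule, so a broken dependency there makes the top-level import fail even though no ORC function is ever called.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def import_toy_package():
    """Build a throwaway package whose __init__ eagerly imports a broken submodule."""
    with tempfile.TemporaryDirectory() as tmp:
        pkg = Path(tmp) / "toypkg"  # hypothetical package name
        (pkg / "io").mkdir(parents=True)
        # Top-level __init__ pulls in the io subpackage, as cudf/__init__.py does
        (pkg / "__init__.py").write_text("from toypkg.io import read_orc\n")
        (pkg / "io" / "__init__.py").write_text("from toypkg.io.orc import read_orc\n")
        # orc.py depends on a module that does not exist in this environment
        (pkg / "io" / "orc.py").write_text(
            "from missing_dependency_for_demo import builder\n"
            "def read_orc(path):\n"
            "    return path\n"
        )

        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        try:
            importlib.import_module("toypkg")
            return "import succeeded"
        except ImportError as exc:
            return f"import failed: {exc}"
        finally:
            # Clean up so the demo leaves no trace in the interpreter state
            sys.path.remove(tmp)
            for name in [m for m in sys.modules if m == "toypkg" or m.startswith("toypkg.")]:
                del sys.modules[name]


if __name__ == "__main__":
    print(import_toy_package())
```

The failure is raised at `import toypkg` itself, which is why the file format being read later (Parquet or ORC) does not matter.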

@hcho3 changed the title from "HPO studio notebook fails when loading from Parquet" to "HPO studio notebook fails when importing cuDF" on Aug 19, 2022

@jacobtomlinson
Member

That makes sense. Since it is the import itself that fails, do you think it would be worth opening an issue on cudf about this?

@hcho3
Contributor Author

hcho3 commented Aug 19, 2022

I think this issue is specific to the particular environment. Let me verify.
