CRITICAL? TFDV 0.21.2 now relies on record_based_tfxio.py in tfx_bsl that has not yet been released #110

Closed
jasonbrancazio opened this issue Feb 20, 2020 · 6 comments

@jasonbrancazio

This is related to tensorflow/tfx-bsl#3

The new TFDV release (0.21.2) imports record_based_tfxio from tfx_bsl.tfxio, a module that has not yet been released in tfx_bsl.

STEPS TO REPRODUCE LOCALLY (assumes you have virtualenv and virtualenvwrapper installed):

mkvirtualenv jb_testing_tfdv_2 --python=python3.7
pip install tfx==0.21.0 tensorflow==2.1 tensorboard==2.1 tensorflow-data-validation==0.21.2
python
import tensorflow_data_validation

Observe the following error:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/jbrancazio/Virtualenvs/jb_testing_tfdv_2/lib/python3.7/site-packages/tensorflow_data_validation/__init__.py", line 33, in <module>
    from tensorflow_data_validation.coders.csv_decoder import DecodeCSV
  File "/Users/jbrancazio/Virtualenvs/jb_testing_tfdv_2/lib/python3.7/site-packages/tensorflow_data_validation/coders/csv_decoder.py", line 26, in <module>
    from tfx_bsl.tfxio import record_based_tfxio
ImportError: cannot import name 'record_based_tfxio' from 'tfx_bsl.tfxio' (/Users/jbrancazio/Virtualenvs/jb_testing_tfdv_2/lib/python3.7/site-packages/tfx_bsl/tfxio/__init__.py)

STEPS TO REPRODUCE (colab):

!pip install tfx==0.21.0 tensorflow==2.1 tensorflow-data-validation==0.21.2
# restart runtime
import tensorflow_data_validation

Observe the following error:

ImportError                               Traceback (most recent call last)
<ipython-input-1-65f1eba19dbf> in <module>()
----> 1 import tensorflow_data_validation

1 frames
/usr/local/lib/python3.6/dist-packages/tensorflow_data_validation/__init__.py in <module>()
     31 
     32 # Import coders.
---> 33 from tensorflow_data_validation.coders.csv_decoder import DecodeCSV
     34 from tensorflow_data_validation.coders.tf_example_decoder import DecodeTFExample
     35 

/usr/local/lib/python3.6/dist-packages/tensorflow_data_validation/coders/csv_decoder.py in <module>()
     24 from tensorflow_data_validation import types
     25 from tfx_bsl.coders import csv_decoder as csv_decoder
---> 26 from tfx_bsl.tfxio import record_based_tfxio
     27 from typing import List, Iterable, Optional, Text
     28 

ImportError: cannot import name 'record_based_tfxio'
@paulgc
Member

paulgc commented Feb 20, 2020

@jasonbrancazio There will be a new tfx_bsl release today which should fix this issue.

@rmothukuru self-assigned this on Feb 24, 2020
@rmothukuru

@jasonbrancazio,
I was able to run the command successfully in Google Colab, as per @paulgc's comment.
Please confirm whether the issue is resolved and whether we can close it. Thanks!

@jasonbrancazio
Author

Resolved, though in theory the requirement tfx-bsl<0.22,>=0.21 could still expose a user to the bug in rare cases, because it also admits tfx-bsl versions older than 0.21.2. A user needs tfx-bsl 0.21.2 or greater.
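
As a rough illustration of the version constraint described above, here is a minimal sketch that checks a tfx-bsl version string against the 0.21.2 minimum. The helper names parse_version and is_new_enough are hypothetical (they are not part of tfx-bsl or TFDV), and the parser assumes plain dotted numeric versions with no pre-release suffixes.

```python
from typing import Tuple

# Minimum tfx-bsl version that this thread says is needed.
MIN_TFX_BSL = "0.21.2"


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn a dotted numeric version such as '0.21.2' into (0, 21, 2).

    Assumption: only plain numeric components (no 'rc1', 'dev0', etc.).
    """
    return tuple(int(part) for part in version.split("."))


def is_new_enough(installed: str, minimum: str = MIN_TFX_BSL) -> bool:
    """Return True if `installed` is at least `minimum`."""
    return parse_version(installed) >= parse_version(minimum)


if __name__ == "__main__":
    # Both versions below satisfy 'tfx-bsl<0.22,>=0.21',
    # but only the second one meets the 0.21.2 minimum.
    for candidate in ("0.21.0", "0.21.2"):
        print(candidate, is_new_enough(candidate))
```

The installed version string could be taken from the output of pip show tfx-bsl and passed to is_new_enough.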

@rmothukuru

@jasonbrancazio,
Can you please let us know if we can close this issue? Thanks!

@jasonbrancazio
Author

@rmothukuru Please see my comment above. Consider editing https://github.com/tensorflow/data-validation/blob/master/setup.py#L101 to require tfx-bsl 0.21.2 or greater. Other than that, I consider the issue resolved.
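
A minimal sketch of what the suggested tightening might look like, assuming the requirement lives in a Python list inside setup.py. The list name REQUIRED_PACKAGES and its layout are hypothetical; the actual contents of setup.py are not shown in this thread and may differ.

```python
# Hypothetical excerpt of a setup.py requirements list (name and layout assumed).
REQUIRED_PACKAGES = [
    # Requirement quoted in this thread: 'tfx-bsl<0.22,>=0.21'
    # Tightened lower bound so that pip cannot pick a tfx-bsl release
    # older than 0.21.2:
    "tfx-bsl>=0.21.2,<0.22",
]
```

Raising only the lower bound keeps the existing upper bound (<0.22) unchanged.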

@arghyaganguly
Contributor

Closing this, as a fix is available. Thanks.
