[Visualization] tfdv visualization throws an error #3079

Closed
Bobgy opened this issue Feb 14, 2020 · 5 comments

Comments

@Bobgy
Contributor

Bobgy commented Feb 14, 2020

What happened:

---------------------------------------------------------------------------
ImportError                               Traceback (most recent call last)
<ipython-input-114-d62b8c370a9d> in <module>
----> 1 import tensorflow_data_validation as tfdv
      2 stats = tfdv.load_statistics('gs://hostedkfp-default-0p59u0vc8g/tfx_taxi_simple/88332a30-3d2a-4c1e-a9e7-f35f972523ff/StatisticsGen/statistics/2/eval/stats_tfrecord')
      3 tfdv.visualize_statistics(stats)

/usr/local/lib/python3.6/dist-packages/tensorflow_data_validation/__init__.py in <module>
     22 
     23 # Import stats API.
---> 24 from tensorflow_data_validation.api.stats_api import GenerateStatistics
     25 
     26 # Import validation API.

/usr/local/lib/python3.6/dist-packages/tensorflow_data_validation/api/stats_api.py in <module>
     50 from tensorflow_data_validation import constants
     51 from tensorflow_data_validation import types
---> 52 from tensorflow_data_validation.statistics import stats_impl
     53 from tensorflow_data_validation.statistics import stats_options
     54 from typing import Generator

/usr/local/lib/python3.6/dist-packages/tensorflow_data_validation/statistics/stats_impl.py in <module>
     31 from tensorflow_data_validation import constants
     32 from tensorflow_data_validation import types
---> 33 from tensorflow_data_validation.arrow import arrow_util
     34 from tensorflow_data_validation.statistics import stats_options
     35 from tensorflow_data_validation.statistics.generators import basic_stats_generator

/usr/local/lib/python3.6/dist-packages/tensorflow_data_validation/arrow/arrow_util.py in <module>
     22 import pyarrow as pa
     23 from tensorflow_data_validation import types
---> 24 from tfx_bsl.arrow import array_util
     25 from typing import Iterable, Optional, Text, Tuple
     26 

/usr/local/lib/python3.6/dist-packages/tfx_bsl/arrow/array_util.py in <module>
     15 # pytype: disable=import-error
     16 # pylint: disable=wildcard-import
---> 17 from tfx_bsl.cc.tfx_bsl_extension.arrow.array_util import *
     18 # pytype: enable=import-error
     19 # pylint: enable=wildcard-import

ImportError: libarrow.so.15: cannot open shared object file: No such file or directory

After #3060, tfdv throws an error.

What did you expect to happen:
It should show the visualization.

What steps did you take:

  1. Use KFP 0.2.2.
  2. Run the TFX sample.
  3. View the StatisticsGen step.
  4. The tab shows a broken visualization.

@Bobgy
Contributor Author

Bobgy commented Feb 14, 2020

/assign @jingzhang36
/kind bug
/priority p0

@jingzhang36
Contributor

After we bumped the tfdv version to 0.21.1, our visualization server seems to have ended up with pyarrow 0.16.x. We'll have to pin it to 0.15.0.
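
For anyone debugging the same mismatch, here is a minimal sketch of a version check that could be run inside the visualization server image. It assumes the installed tfx_bsl build links against the Arrow 0.15 series, which is what the missing `libarrow.so.15` in the traceback suggests. The helper names `matches_pin` and `installed_pyarrow_version` are hypothetical and are not part of tfdv, tfx_bsl, or KFP.

```python
from importlib import metadata

# Assumption: the installed tfx_bsl build expects the Arrow 0.15 series,
# as suggested by the missing libarrow.so.15 in the traceback.
REQUIRED_PREFIX = "0.15."


def matches_pin(installed_version: str, required_prefix: str = REQUIRED_PREFIX) -> bool:
    """Return True if the installed version belongs to the pinned release series."""
    return installed_version.startswith(required_prefix)


def installed_pyarrow_version():
    """Return the installed pyarrow version string, or None if pyarrow is absent."""
    try:
        return metadata.version("pyarrow")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_pyarrow_version()
    if version is None:
        print("pyarrow is not installed")
    elif matches_pin(version):
        print(f"pyarrow {version} matches the {REQUIRED_PREFIX}x pin")
    else:
        print(f"pyarrow {version} does not match the {REQUIRED_PREFIX}x pin")
```

If the check reports a mismatch, the pin described above would amount to something like `pyarrow==0.15.0` in the server's Python requirements; which file holds those requirements is not shown in this thread.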

@Bobgy
Contributor Author

Bobgy commented Feb 14, 2020

That seems like another failure point that argues for prioritizing #3078.

@Bobgy
Contributor Author

Bobgy commented Feb 14, 2020

Thanks for the quick check!

@rmgogogo
Contributor

+rmgogogo
We may provide a quick patch after the fix (patching only the visualization server).
