apply_vocabulary lookup table initialization needs to be wrapped inside tf.init_scope
#249
Comments
@EdwardCuiPeacock I don't quite understand how you are getting that warning message. It appears as though we already enter tf.init_scope at transform/tensorflow_transform/tf_utils.py, line 1578 in b06e87b.
As Chris commented above, we do enter the tf.init_scope in tf_utils. Are you passing a lookup_fn to tft.apply_vocabulary? If so, you would need to lift the table creation inside that lookup_fn yourself, as TFT does not have access to the table creation code and cannot do this automatically. Could you give me an example of what your calls to tft.vocabulary and tft.apply_vocabulary look like, so I can take a look and see if we missed something?
I've encountered the same warning while using TFX. In my case, the asset_filepath at transform/tensorflow_transform/tf_utils.py (lines 1654 to 1655 in 9b348c8) is a <class 'tensorflow.python.framework.ops.Tensor'>.
My usage is:

    transformed = tft.compute_and_apply_vocabulary(
        input_tensor,
        frequency_threshold=100,
        num_oov_buckets=1,
        vocab_filename='tags'
    )

It isn't a big issue for me as my vocabulary is small, so I haven't spent much time looking into it and don't have a proper MRE, but maybe that much is helpful.
I've just checked using this notebook: https://colab.research.google.com/github/tensorflow/tfx/blob/master/docs/tutorials/tfx/components_keras.ipynb#scrollTo=jHfhth_GiZI9, and the warning appears there as well. Possibly related to being wrapped with the code at transform/tensorflow_transform/analyzers.py, line 1973 in 520ebb4.
Similar to @gfkeith, I also get this warning when using
We recently encountered scalability issues when trying to apply the vocabularies for multiple (5, to be exact) categorical features. We saw multiple lines of the following warning message:
When using tft.apply_vocabulary, the job would get stuck on the transformation steps for hours, consuming thousands of CPU hours if we did not kill it early. Creating a custom lookup table initialization function like the following could bypass the problem; 80M rows of data took only 35 min, consuming ~20 hours of CPU time.
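As a minimal sketch of what a custom lookup function with table creation lifted into tf.init_scope could look like (this is an illustration, not the code originally posted): the helper name make_init_scope_lookup_fn is hypothetical, the vocabulary is assumed to be a plain text file with one token per line, and the filename is assumed to be usable from the outer context (a path string or eager tensor; a symbolic graph tensor, as reported for asset_filepath above, may not work this way). The returned function follows the lookup_fn contract of taking the input tensor and the vocabulary filename and returning the looked-up ids together with the table size.

```python
import tensorflow as tf


def make_init_scope_lookup_fn(default_value=-1):
    """Builds a lookup_fn whose table is created inside tf.init_scope.

    Hypothetical helper for illustration; not part of TFT.
    """

    def lookup_fn(x, deferred_vocab_filename_tensor):
        # Lift initializer and table creation out of any function-building
        # graph, so the table is created in the outermost context.
        with tf.init_scope():
            initializer = tf.lookup.TextFileInitializer(
                deferred_vocab_filename_tensor,
                key_dtype=tf.string,
                key_index=tf.lookup.TextFileIndex.WHOLE_LINE,
                value_dtype=tf.int64,
                value_index=tf.lookup.TextFileIndex.LINE_NUMBER,
            )
            table = tf.lookup.StaticHashTable(
                initializer, default_value=default_value
            )
        # Return the integer ids and the table size.
        return table.lookup(x), table.size()

    return lookup_fn
```

Under these assumptions, the function would be passed as the lookup_fn argument of tft.apply_vocabulary, for example lookup_fn=make_init_scope_lookup_fn(). Tokens missing from the vocabulary file map to default_value.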
Relevant code that needs to be addressed: transform/tensorflow_transform/mappers.py, line 1114 in 520ebb4.
This probably needs to be applied to versions of TFT starting from 1.0.