TF addons compatibility with TF nightly / TF2.2 #1716

Closed
nnigania opened this issue Apr 22, 2020 · 8 comments

@nnigania

I am using a TF Addons library function (Gelu) with the TF 2.2 nightly. It currently emits the compatibility warning below. The aim of this bug is to clean up that warning.

"
/usr/local/lib/python3.6/dist-packages/tensorflow_addons/utils/resource_loader.py:95: UserWarning: You are currently using TensorFlow 2.2.0-dev20200421 and trying to load a custom op (custom_ops/activations/_activation_ops.so).
TensorFlow Addons has compiled its custom ops against TensorFlow 2.1.0, and there are no compatibility guarantees between the two versions.
This means that you might get segfaults when loading the custom op, or other kind of low-level errors.
If you do, do not file an issue on Github. This is a known limitation.

It might help you to fallback to pure Python ops with TF_ADDONS_PY_OPS . To do that, see https://github.com/tensorflow/addons#gpucpu-custom-ops

You can also change the TensorFlow version installed on your system. You would need a TensorFlow version equal to or above 2.1.0 and strictly below 2.2.0.
Note that nightly versions of TensorFlow, as well as non-pip TensorFlow like conda install tensorflow or compiled from source are not supported.

The last solution is to find the TensorFlow Addons version that has custom ops compatible with the TensorFlow installed on your system. To do that, refer to the readme: https://github.com/tensorflow/addons
UserWarning,
/usr/local/lib/python3.6/dist-packages/tensorflow_addons/options.py:47: RuntimeWarning: Traceback (most recent call last):
File "/usr/local/lib/python3.6/dist-packages/tensorflow_addons/activations/gelu.py", line 49, in gelu
return _gelu_custom_op(x, approximate)
File "/usr/local/lib/python3.6/dist-packages/tensorflow_addons/activations/gelu.py", line 57, in _gelu_custom_op
return _activation_so.ops.addons_gelu(x, approximate)
File "/usr/local/lib/python3.6/dist-packages/tensorflow_addons/utils/resource_loader.py", line 56, in ops
self._ops = tf.load_op_library(get_path_to_datafile(self.relative_path))
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/load_library.py", line 58, in load_op_library
lib_handle = py_tf.TF_LoadLibrary(library_filename)
tensorflow.python.framework.errors_impl.NotFoundError: /usr/local/lib/python3.6/dist-packages/tensorflow_addons/custom_ops/activations/_activation_ops.so: undefined symbol: _ZN10tensorflow14kernel_factory17OpKernelRegistrar12InitInternalEPKNS_9KernelDefEN4absl11string_viewESt10unique_ptrINS0_15OpKernelFactoryESt14default_deleteIS8_EE

The gelu C++/CUDA custom op could not be loaded.
For this reason, Addons will fallback to an implementation written
in Python with public TensorFlow ops. There worst you might experience with
this is a moderate slowdown on GPU. There can be multiple
reason for this loading error, one of them may be an ABI incompatibility between
the TensorFlow installed on your system and the TensorFlow used to compile
TensorFlow Addons' custom ops. The stacktrace generated when loading the
shared object file was displayed above.
"

@nnigania
Author

@tomerk
@dynamicwebpaige

@seanpmorgan
Member

seanpmorgan commented Apr 22, 2020

Hi @nnigania, is there information you find unclear in this warning, or something you would like changed?

For context, we build C++ custom ops against a specific version of TF (in this case TF 2.1). TensorFlow does not provide stability guarantees for linking against libtensorflow_framework. You can find more information and troubleshooting in this thread or in our central README:
#1298 (comment)
https://github.com/tensorflow/addons#c-custom-op-compatibility

We will be releasing TFA 0.10 with custom-op compatibility for TF 2.2 once the stable TF 2.2 release is out. Ultimately this is an issue with stability guarantees from core TF and not something we can fix in Addons. You can suppress this warning by setting the environment variable mentioned in the warning so that the Python composite op is used instead of the custom op. We also have plans to make that the default setting for activation functions.
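
A minimal sketch of that fallback, assuming TF_ADDONS_PY_OPS is read at import time and that setting it to "1" selects the Python composite ops (the README linked in the warning is the authoritative reference):

```python
import os

# Assumption: this must be set before tensorflow_addons is imported so that
# the pure-Python composite ops are used instead of the compiled custom ops.
os.environ["TF_ADDONS_PY_OPS"] = "1"

import tensorflow as tf
import tensorflow_addons as tfa

# With the Python path selected, loading _activation_ops.so should no longer
# be attempted, so the compatibility warning should not appear.
y = tfa.activations.gelu(tf.constant([0.5, -0.5]), approximate=True)
```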

@tomerk
Contributor

tomerk commented Apr 22, 2020

Hi @seanpmorgan, just to clarify here:
We're trying to do some internal performance testing that I believe requires using a custom op.

So, @nnigania tried using tf nightly with TensorFlow Addons. If the custom ops are not compatible, does that mean the experiments need to build Addons at head rather than pip installing Addons?

Or, is there a way to suppress the warning while still trying to use the custom op instead of the Python composite op? (And if it segfaults, so be it.)

@seanpmorgan
Member

> Hi @seanpmorgan, just to clarify here:
> We're trying to do some internal performance testing that I believe requires using a custom op.
>
> So, @nnigania tried using tf nightly with TensorFlow Addons. If the custom ops are not compatible, does that mean the experiments need to build Addons at head rather than pip installing Addons?
>
> Or, is there a way to suppress the warning while still trying to use the custom op instead of the Python composite op? (And if it segfaults, so be it.)

Interesting, got it. So the "correct" solution here is probably to compile TFA against TF 2.2. Information on how to do that can be found here:
https://github.com/tensorflow/addons#cpugpu-custom-ops

As a workaround you could try installing tensorflow-addons==0.8.3. That version was built against TF 2.1 and did not have the Python fallback & warning message. The custom-op implementation is the same, but there is a chance it'll crash (it would be quick to find out).
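
For the "quick to find out" part, a sketch of the check under tensorflow-addons==0.8.3 (a failure here would be the ABI incompatibility described above, not something to file as a new bug):

```python
import tensorflow as tf
import tensorflow_addons as tfa   # tensorflow-addons==0.8.3 for this experiment

print(tf.__version__, tfa.__version__)

# 0.8.3 calls the compiled custom op directly (no Python fallback), so this
# either runs the custom kernel or fails when loading the shared object,
# e.g. with an undefined-symbol error like the one in the original report.
print(tfa.activations.gelu(tf.constant([-1.0, 0.0, 1.0]), approximate=True))
```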

Please keep us in the loop on your findings if possible. We've done some of our own benchmarking, since we've found that it's difficult to maintain custom ops (as shown by this warning, haha):
tensorflow/tensorflow#33945 (comment)

@seanpmorgan
Member

seanpmorgan commented Apr 22, 2020

Unfortunately it crashes:
https://colab.research.google.com/drive/1Ea70IA0GxhiwoeLOYrjbHBOxk2_AMN1C

If you really don't want to build from source you could use a wheel from our CI artifacts (built against 2.2rc3):
https://github.com/tensorflow/addons/suites/616960586/artifacts/4811368

@seanpmorgan
Member

Closing this issue because it is outside of what we support for general usage. TFA 0.10 will be released shortly after TF 2.2 for general-purpose use.

Feel free to continue discussion in this thread though regarding your benchmarking and we'll try to help as best we can.

@nnigania
Author

Thanks for your response. Since I am not building from source, I will use the artifact you shared for the time being. Will update in case I still hit any issues, and will look out for TFA 0.10.

@nnigania
Author

I tried using the artifact, but since I am on TF nightly, it seems TF Addons is still incompatible with it. What would be the suggested solution to make it work with tf-nightly?
