
Big TF test cleanup #24282

Merged: 19 commits merged into main on Jun 16, 2023
Conversation

Rocketknight1 (Member)

Now that we've done a big overhaul of the TF model internals, a lot of tests can be fixed. Several tests were disabled for being buggy or too slow - these are almost all performant now, so I re-enabled them. Runtime for the re-enabled tests was 15-20 seconds on my local machine.

Also, we had a number of TF test failures in the daily CI. I think this PR should fix all of them, except for two cases:

Firstly, some models have issues with resize_token_embeddings. These failures are caused by the transition to TFSharedEmbedding that @gante is currently working on, and I didn't want to interfere! The usual cause is that resize_token_embeddings replaces the new-style TFSharedEmbedding with an old-style tf.Variable.

Secondly, there are a couple of failures in generate tests. I'm also leaving this to @gante because he knows much more about that code than me 😅

HuggingFaceDocBuilderDev commented Jun 14, 2023

The documentation is not available anymore as the PR was closed or merged.

decoder_input_ids: tf.Tensor | None = None,
decoder_attention_mask: tf.Tensor | None = None,
attention_mask: tf.Tensor | None = None,
output_attentions: Optional[bool] = None,
foutput_attentions: Optional[bool] = None,
Member

[screenshot: 2023-06-14 at 18:34:30]

Member Author

ahahaha

@gante (Member) left a comment

LGTM 👍

gante (Member) commented Jun 14, 2023

(@Rocketknight1 ping me if the gen tests are not sorted after the latest push)

@ydshieh (Collaborator) left a comment

Thank you for all this TF work!

Other than making sure (all) the re-enabled tests pass now (I guess you already checked them), I have just 2 nit comments.

Comment on lines 1160 to 1162
# Set the serving spec quickly to ensure that Keras doesn't use the specific dummy input shapes as the spec
self._set_save_spec(self._prune_signature(self.input_signature))
Collaborator

Maybe explain a bit why we don't do this in the init method?

Collaborator

+1

Member Author

It's a bit of a long story! _set_save_spec is normally called internally by Keras, and for subclassed models (i.e. all models in transformers) it uses the first input shapes the model sees. This was a huge problem for us, because we'd pass in some tiny dummy inputs and it would just lock in that useless specific shape as the model's save spec. This made exporting/serving a real nightmare!

We avoid that by setting a correct, general save spec before the model has seen any inputs. It doesn't really matter in most cases whether we put that in the __init__ or the build method, as long as it happens before we pass dummy inputs in. However, there is one edge case where it makes a small difference: If the user builds a model from a config, and then passes inputs of a specific shape in. In this case, putting it in build() allows the user to set the save spec with their own inputs, which can be useful in a couple of cases.

I'm not convinced this is a perfect solution, but it resolves an edge case in our in-graph tokenizer test, so it seems a little better than the alternative!
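
For readers unfamiliar with the mechanism, here is a minimal, hedged sketch of the idea (the toy class below is illustrative, not the actual transformers implementation; _set_save_spec is the private Keras API that the quoted diff calls):

import tensorflow as tf


class ToyModel(tf.keras.Model):
    # Illustrative stand-in for transformers' input_signature property: fully general
    # (unknown batch and sequence length) rather than tied to tiny dummy inputs.
    input_signature = {"input_ids": tf.TensorSpec([None, None], tf.int32, name="input_ids")}

    def build(self, input_shape=None):
        # Register the general spec *before* any concrete dummy inputs are traced, so
        # Keras doesn't lock in the dummy shapes as the serving/save spec.
        self._set_save_spec(self.input_signature)
        self.built = True

    def call(self, input_ids):
        return tf.shape(input_ids)

With the spec registered up front, the export/serving signature uses the general (None, None) shape instead of whatever dummy batch happened to be seen first.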

Collaborator

So a short version could be

# putting this in build() allows the user to set the save spec with their own inputs.

😄 ...?

Member Author

Sure! I'll comment something like that

Member Author

Done!

@@ -463,19 +463,12 @@ def _prepare_decoder_attention_mask(
) -> tf.Tensor:
# create causal mask
# [bsz, seq_len] -> [bsz, 1, tgt_seq_len, src_seq_len]
combined_attention_mask: tf.Tensor | None = None
if input_shape[-1] > 1:
Collaborator

We don't need to check this condition anymore...?

Collaborator

+1

Member Author

I don't think so! I couldn't see a case where input_shape[-1] == 0 was possible.

Collaborator

I mean > 1 vs == 1, not == 0.

Member Author

Ah, you're totally right, I don't know how I blanked on that! Let me fix it.

Member Author

Done, and sorry for the extremely embarrassing oversight where my eyes kept reading > as >=!
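
To make the boundary case concrete, here is a hedged, self-contained illustration (not the transformers code) of why a query length of 1 needs no causal mask at all:

import tensorflow as tf


def make_causal_mask(tgt_len: int) -> tf.Tensor:
    # 0.0 where attention is allowed, a large negative number where it is masked.
    allowed = tf.linalg.band_part(tf.ones((tgt_len, tgt_len)), -1, 0)
    return (1.0 - allowed) * -1e9


print(make_causal_mask(1))  # [[0.]] -- masks nothing, so the seq_len == 1 branch can skip it
print(make_causal_mask(3))  # strict upper triangle is -1e9, everything else is 0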

Comment on lines -228 to -231
@tooslow
def test_saved_model_creation(self):
pass

Collaborator

This is still a slow test, and I would like to know if this re-enabled test passes now.

Member Author

It passes!

Comment on lines -695 to -698
@unittest.skip(reason="Currently `saved_model` doesn't work with nested outputs.")
@slow
def test_saved_model_creation_extended(self):
pass
Collaborator

Would like to know if this re-enabled test passes now.

Member Author

saved_model_creation_extended is now a core test that is only run on a few models because it's very expensive, so this skip is no longer needed.

Comment on lines -295 to -297
def test_xla_mode(self):
# TODO JP: Make LED XLA compliant
pass
Collaborator

Would like to know if this re-enabled test passes now.

Member Author

It passes!

@amyeroberts (Collaborator) left a comment

Thanks for cleaning up!

Overall, changes look good to me.

  • Big +1 to all of @ydshieh's comments
  • For just the affected models, could you run the slow tests this PR changes? In particular, test_saved_model_creation_extended?
  • Could you run a generation test with speech to text to make sure the embeddings reshaping is working?

@@ -490,6 +490,7 @@ def test_model_without_retriever(self):
inputs_dict = self.config_and_inputs
self.check_model_without_retriever(**inputs_dict)

@slow
Collaborator

Are these related to failures in the generate tests, or did they just become slow?

If failures, could you add a unittest.skip decorator instead?

Member Author

These aren't failures! They're just extremely slow generation tests (with retrieval!), and were sometimes triggering the 120s timeout in the live CI.

Comment on lines 1160 to 1162
# Set the serving spec quickly to ensure that Keras doesn't use the specific dummy input shapes as the spec
self._set_save_spec(self._prune_signature(self.input_signature))
Collaborator

+1

# idempotent. TF doesn't need that caching anyway, since it can just store constants during compilation,
# so we just remove all of that code.
embeddings = self._get_embedding(
self.padding_idx + 1 + seq_len + self.offset + past_key_values_length, self.embedding_dim, self.padding_idx
Collaborator

This isn't exactly the same, as past_key_values_length wasn't added before. This seems more correct, but can we run some tests on generation to make sure this works as expected?

Member Author

Hi @amyeroberts, you're right! It's actually okay to sometimes generate too many embeddings, though, because the embeddings tensor is only transiently created in this function, gathered from and then discarded again. I ran the slow tests for this model and all passed.

Member Author

...though as I write this, I realize that all of this code is just a speed hack because eager Torch code can't optimize or compute things out of order, so really I should just directly transform the position IDs into the embeddings and skip the whole gather!
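
As a hedged sketch of that simplification (illustrative only, not the actual Speech2Text code): compute the sinusoidal embeddings for exactly the positions needed, offset by past_key_values_length, instead of building an oversized table and gathering from it.

import tensorflow as tf


def sinusoidal_embeddings(position_ids: tf.Tensor, embedding_dim: int) -> tf.Tensor:
    # position_ids: 1-D int tensor of positions; embedding_dim assumed even.
    positions = tf.cast(position_ids, tf.float32)[:, None]
    inv_freq = tf.exp(
        tf.cast(tf.range(0, embedding_dim, 2), tf.float32) * -(tf.math.log(10000.0) / embedding_dim)
    )
    angles = positions * inv_freq[None, :]
    return tf.concat([tf.sin(angles), tf.cos(angles)], axis=-1)


# e.g. embeddings for the current chunk, offset by the cached length:
# sinusoidal_embeddings(tf.range(past_key_values_length, past_key_values_length + seq_len), dim)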

)
dec_attn_mask = upper_mask + lower_mask
else:
dec_attn_mask = upper_mask
Collaborator

This is a lot easier to understand :)

combined_attention_mask = _make_causal_mask(input_shape, past_key_values_length)
if attention_mask is None:
return combined_attention_mask
else:
Collaborator

ultra nit: if we return in an if statement, we don't need the else

Member Author

Fixed!
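
For illustration, a small self-contained sketch of the early-return pattern being suggested (names are generic, not the transformers helpers):

from typing import Optional

import tensorflow as tf


def combine_masks(causal_mask: tf.Tensor, padding_mask: Optional[tf.Tensor]) -> tf.Tensor:
    if padding_mask is None:
        return causal_mask
    # No `else` needed: the early return above already covered the None case.
    return causal_mask + padding_mask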

@@ -463,19 +463,12 @@ def _prepare_decoder_attention_mask(
) -> tf.Tensor:
# create causal mask
# [bsz, seq_len] -> [bsz, 1, tgt_seq_len, src_seq_len]
combined_attention_mask: tf.Tensor | None = None
if input_shape[-1] > 1:
Collaborator

+1

pass

@slow
def test_saved_model_creation_extended(self):
Collaborator

Does this run now with the default test_saved_model_creation_extended test?

Member Author

test_saved_model_creation_extended is now a core test that is only run on a few models because it's very expensive, so this skip is no longer needed.

@amyeroberts (Collaborator) left a comment

Thanks for iterating!

I just have one question to address re the refactoring of _prepare_decoder_attention_mask before merging

Comment on lines -314 to -315
@slow
def test_keras_fit(self):
Collaborator

Just double-checking that this is now fast for this model?

Member Author

Actually, good question. On my local machine, this test takes 20-40 seconds depending on the model. MobileBERT is one of the slower ones, but it's still inside that range.

However, 20-40 seconds is probably in the range where the whole test should be marked as slow to keep it out of the quick CI, right?

Collaborator

We have some tests (not decorated as slow) that run for more than 40 seconds, but let's not add more such tests. Decorate it as slow and everyone's life is easier 🍺

Collaborator

🍻 cheers to that!
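
For reference, a hedged sketch of the suggested decoration, using the slow marker from transformers.testing_utils (the test class name is illustrative; in the real suite the test body lives in the shared tester mixin):

import unittest

from transformers.testing_utils import slow


class TFSomeModelTest(unittest.TestCase):
    @slow  # skipped unless RUN_SLOW=1, i.e. only the scheduled slow CI runs it
    def test_keras_fit(self):
        ...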

return combined_attention_mask
combined_attention_mask = _make_causal_mask(input_shape, past_key_values_length)
combined_attention_mask = tf.cond(
input_shape[-1] > 1, lambda: combined_attention_mask, lambda: tf.ones_like(combined_attention_mask)
Collaborator

Is this completely equivalent?

If I've understood before and after correctly, if attention_mask is not None and input_shape[-1] == 1, then in the old case:

combined_attention_mask = expand_attention_mask

and in the new:

combined_attention_mask = expand_attention_mask + tf.ones_like(combined_attention_mask)

i.e. an additional matrix of 1s is added

Member Author

Ah, I think you're right! Hang on, let me see what I can do.

Member Author

(This issue, like most of our issues, is caused by me assuming that tests passing = no problems)

Member Author

Investigation complete! So basically, I can't actually make this function reproduce the old behaviour when we're compiling with flexible shapes, because it's just totally forbidden in TF to have a conditional where one branch returns None and the other branch returns a tf.Tensor.

However, when I actually followed the code through to self-attention where the attention mask is used, an attention mask of None is just treated as an all-ones mask (i.e. neither affects the attention logits at all). Therefore, returning all-ones instead of None yields the same model outputs, while obeying TF's requirements for compiling conditionals.
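
A hedged sketch of why that substitution is safe (illustrative only; the real attention code uses its own helpers, but the additive-bias idea is the same):

import tensorflow as tf


def mask_bias(attention_mask, dtype=tf.float32):
    # The mask is folded into the attention logits as an additive bias: allowed
    # positions contribute 0, masked positions a large negative number.
    if attention_mask is None:
        return tf.constant(0.0, dtype=dtype)
    return (1.0 - tf.cast(attention_mask, dtype)) * tf.constant(-1e9, dtype=dtype)


# mask_bias(None) is 0 and mask_bias(tf.ones((2, 4))) is all zeros, so the attention
# scores -- and therefore the model outputs -- are identical in both cases.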

Rocketknight1 (Member Author)

I think everything has been addressed now, but I'm not going to merge this one today because there's another PR affecting our tests (#24301) and ideally I'd like to be able to separately view their impact on the CI!

Rocketknight1 merged commit 3403712 into main on Jun 16, 2023
Rocketknight1 deleted the big_tf_test_cleanup branch on Jun 16, 2023 at 14:40
ydshieh (Collaborator) commented Jun 16, 2023

> I think everything has been addressed now, but I'm not going to merge this one today

Nice 👍.

I never merge PRs on Friday evening or early afternoon. I don't want to get a ☎️ ⚡!

ydshieh (Collaborator) commented Jun 16, 2023

Wait, you merged ...!? (but you said you are not going to merge 🤔 )
