
Reduce memory usage in TF building #24046

Merged · 3 commits into main · Jun 6, 2023
Conversation

Rocketknight1
Member

This PR reduces the default shape of dummy inputs from (3, 3) to (2, 2). This slightly reduces the memory usage when building TF models, which should hopefully fix some of our pipeline tests.

We could replace the dummy inputs with symbolic tensors, which would mean we could build TF models with 0 memory usage, but this would make TF model building slower (~4X) because it would implicitly compile the model when building, which is probably not an acceptable tradeoff.
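The shape-filling logic at the heart of this change can be sketched in plain Python (a simplified stand-in for the actual `dummy_inputs` property; the `Spec` type here is a hypothetical substitute for `tf.TensorSpec`):

```python
from typing import NamedTuple, Optional, Tuple

class Spec(NamedTuple):
    # Hypothetical stand-in for tf.TensorSpec: None marks an unknown dimension
    shape: Tuple[Optional[int], ...]

def dummy_shape(spec: Spec, fill: int) -> list:
    # Unknown (None) dimensions are filled with a small arbitrary size
    return [dim if dim is not None else fill for dim in spec.shape]

input_ids = Spec(shape=(None, None))   # batch and sequence dims unknown
print(dummy_shape(input_ids, fill=3))  # old default: [3, 3] -> 9 elements
print(dummy_shape(input_ids, fill=2))  # new default: [2, 2] -> 4 elements
```

The saving per input tensor looks modest (9 vs. 4 elements here), but the dummies flow through the whole forward pass during building, so the activation memory allocated along the way shrinks as well.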

cc @ydshieh and @amyeroberts as core maintainers

@HuggingFaceDocBuilderDev

HuggingFaceDocBuilderDev commented Jun 6, 2023

The documentation is not available anymore as the PR was closed or merged.

Collaborator

@ydshieh ydshieh left a comment

Thanks, the change itself looks good to me.

@ydshieh
Collaborator

ydshieh commented Jun 6, 2023

Let me run it on CI and see.

Collaborator

@amyeroberts amyeroberts left a comment

Change LGTM - thanks for updating!

Happy to merge once @ydshieh gives the 👍 from CI runs

@@ -1116,16 +1116,16 @@ def dummy_inputs(self) -> Dict[str, tf.Tensor]:
         dummies = {}
         sig = self._prune_signature(self.input_signature)
         for key, spec in sig.items():
-            # 3 is the most correct arbitrary size. I will not be taking questions
-            dummies[key] = tf.ones(shape=[dim if dim is not None else 3 for dim in spec.shape], dtype=spec.dtype)
+            # 2 is the most correct arbitrary size. I will not be taking questions
+            dummies[key] = tf.ones(shape=[dim if dim is not None else 2 for dim in spec.shape], dtype=spec.dtype)
Collaborator

I wish to file this diff as evidence to the contrary #team3

@Rocketknight1
Member Author

Sorry for the delay - there's an issue with Funnel that wasn't reproducing on my machine. I eventually figured out that the problem is the classic TF one: indices for tf.gather are validated on CPU but not on GPU, so the bug only becomes apparent on CPU. Will fix in just a sec!
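For illustration, here is a pure-Python sketch of that behavioral difference (hypothetical toy functions, not TensorFlow's actual kernels): the CPU gather kernel validates indices and raises on out-of-range ones, while the GPU kernel skips validation and silently writes 0 for them, so a bad index can go unnoticed on GPU.

```python
def gather_cpu(params, indices):
    # CPU-style gather: indices are validated, out-of-range ones raise
    for i in indices:
        if not 0 <= i < len(params):
            raise IndexError(f"index {i} not in [0, {len(params)})")
    return [params[i] for i in indices]

def gather_gpu(params, indices):
    # GPU-style gather: no validation, out-of-range reads yield 0 silently
    return [params[i] if 0 <= i < len(params) else 0 for i in indices]

params = [10, 20, 30]
print(gather_gpu(params, [0, 5]))  # [10, 0] -- the bad index is hidden
# gather_cpu(params, [0, 5]) would raise IndexError -- the bug is exposed
```

This is why a model with an out-of-range index can pass on a GPU dev machine yet fail on CPU-only CI.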

@ydshieh
Collaborator

ydshieh commented Jun 6, 2023

I also tried to run the change in this PR, and got

FAILED tests/pipelines/test_pipelines_common.py::PipelineUtilsTest::test_load_default_pipelines_tf - tensorflow.python.framework.errors_impl.ResourceExhaustedError: {{function_node __wrapped__Transpose_device_/job:localhost/replica:0/task:0/device:GPU:0}} OOM when allocating tensor with shape[768,768] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc [Op:Transpose]
FAILED tests/pipelines/test_pipelines_common.py::PipelineUtilsTest::test_load_default_pipelines_tf_table_qa - tensorflow.python.framework.errors_impl.ResourceExhaustedError: Exception encountered when calling layer 'tapas' (type TFTapasMainLayer).

{{function_node __wrapped__StatelessTruncatedNormalV2_device_/job:localhost/replica:0/task:0/device:GPU:0}} OOM when allocating tensor with shape[30522,768] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc [Op:StatelessTruncatedNormalV2]

Call arguments received by layer 'tapas' (type TFTapasMainLayer):
  • input_ids=tf.Tensor(shape=(2, 2), dtype=int32)
  • attention_mask=tf.Tensor(shape=(2, 2), dtype=float32)
  • token_type_ids=tf.Tensor(shape=(2, 2, 7), dtype=int32)
  • position_ids=None
  • head_mask=None
  • inputs_embeds=None
  • output_attentions=False
  • output_hidden_states=False
  • return_dict=True
  • training=False

and 5 other failures (probably caused by the one above).

@Rocketknight1 I think we will have to iterate (change → run → change → run) a bit more before we merge.

@Rocketknight1
Member Author

Yep, working on it now!

@ydshieh
Collaborator

ydshieh commented Jun 6, 2023

The tests/pipelines/test_pipelines_common.py::PipelineUtilsTest::test_load_default_pipelines_tf test runs against a list of models, so it's fairly normal for it to fail on other models even after some fixes have been made.

I'm happy to trigger the run (a subset) whenever you feel it's time. Otherwise, I can show you a modified workflow file that you can trigger manually.

@Rocketknight1
Member Author

@ydshieh the issues with Funnel have been resolved, so this should be ready for a CI run now!

@ydshieh
Collaborator

ydshieh commented Jun 6, 2023

You can watch it live here. It will take 20-30 min to finish.

@Rocketknight1
Member Author

Looks like they're still failing even with very small dummies - which is odd, since the new dummies should be strictly smaller than the old ones! I'll investigate those models and try to figure out why.

@Rocketknight1
Member Author

Maybe this is a sign that we should transition the dummies to symbolic tensors for those models, even if it's probably too slow for our tests to do it across the whole codebase.

@Rocketknight1 Rocketknight1 merged commit 7203ea6 into main Jun 6, 2023
@Rocketknight1 Rocketknight1 deleted the lower_dummy_memory_usage branch June 6, 2023 17:29
novice03 pushed a commit to novice03/transformers that referenced this pull request Jun 23, 2023
* Make the default dummies (2, 2) instead of (3, 3)

* Fix for Funnel

* Actually fix Funnel