Update test_batched_inference_image_captioning_conditioned
#23391
Conversation
Thanks for the PR!
I think we should educate users that, for text-conditioned generation, we should never add special tokens to the tokenizer, as introduced in #23004.
I think here the model sees `An photography of<|endoftext|>`, which leads it to generate weird output right after.
As users might check these tests as a reference, I think we should encode the input in the test without the special tokens (i.e. force add_special_tokens=False) and document that somewhere (I can take care of that in a follow-up PR).
What do you think?
But in principle, that solution LGTM as well! You can merge that and I can address what I said later.
@younesbelkada If you can take over this PR to avoid …
The test should now be fixed. The generated text differs from before, probably due to #23051, which made the model use a causal attention mask on the text decoder (which was not the case before).
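For context, a causal attention mask simply means position i can only attend to positions j <= i, so the decoder cannot peek at future tokens. A minimal pure-Python sketch (illustrative only, not the #23051 implementation):

```python
# Build a lower-triangular causal mask: row i marks which positions
# position i is allowed to attend to (1 = visible, 0 = masked out).
def causal_mask(seq_len):
    return [[1 if j <= i else 0 for j in range(seq_len)]
            for i in range(seq_len)]

for row in causal_mask(4):
    print(row)
# [1, 0, 0, 0]
# [1, 1, 0, 0]
# [1, 1, 1, 0]
# [1, 1, 1, 1]
```

Switching a decoder from bidirectional to causal attention changes what each token can condition on, which is why the expected generations in the test changed.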
Thanks a lot!
@younesbelkada The best place to do this, I think, is in the example docstring for the model, as that is what a lot of users will reference, and it currently doesn't cover this. Could you open a PR to update it?
Thanks for updating!
Changes look fine. My only concern is that the generations appear to have become worse. @younesbelkada @ydshieh do we have any other generation samples to make sure the model is behaving as expected?
Sure yes will do!
Yes! I was relieved since we do have the tests.
```diff
predictions = model.generate(**inputs)

self.assertEqual(
-    processor.decode(predictions[0], skip_special_tokens=True), "A picture of a stop sign that says yes."
+    processor.decode(predictions[0], skip_special_tokens=True),
+    "A picture of a stop sign with a red stop sign on it.",
)
```

Actually, seeing this image, I would say this new prediction is better than the previous one.

```diff
self.assertEqual(
    processor.decode(predictions[1], skip_special_tokens=True),
-    "An photography of the Temple Bar and a few other places.",
+    "An photography of the Temple Bar and the Temple Bar.",
)
```

Not better, but I would say the previous one, about "a few other places", is not super good either.
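As an aside, a minimal pure-Python sketch (hypothetical, not the real transformers tokenizer) of what `skip_special_tokens=True` does when decoding these predictions: special tokens are filtered out before the tokens are joined back into text.

```python
# Toy decoder (illustrative only): the special-token set and the
# string-token representation are stand-ins for real tokenizer internals.
SPECIAL_TOKENS = {"<|endoftext|>", "<pad>"}

def decode(tokens, skip_special_tokens=False):
    if skip_special_tokens:
        tokens = [t for t in tokens if t not in SPECIAL_TOKENS]
    return " ".join(tokens)

pred = ["A", "picture", "of", "a", "stop", "sign", "<|endoftext|>"]
print(decode(pred, skip_special_tokens=True))
# A picture of a stop sign
```

This is why the test assertions compare against clean sentences even though the raw model output ends with an end-of-text token.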
I am going to merge this PR and leave @amyeroberts's suggestion for @younesbelkada in a separate PR. Thank you for the review and the refinement of this PR.
…face#23391)

* fix
* fix
* fix test + add more docs

Co-authored-by: ydshieh <[email protected]>
Co-authored-by: younesbelkada <[email protected]>
What does this PR do?
The test
tests/models/pix2struct/test_modeling_pix2struct.py::Pix2StructIntegrationTest::test_batched_inference_image_captioning_conditioned
started to fail on the CI run of April 27, which includes the merged PR #23023.
@younesbelkada Could you double check if the changes in this PR are reasonable? Thank you.