Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow Chatbot examples to show more than one image #10111

Merged
merged 28 commits into from
Dec 13, 2024

Conversation

hannahblair
Copy link
Collaborator

@hannahblair hannahblair commented Dec 3, 2024

Description

This PR allows the Chatbot to display more files and info in the examples area and adds some design tweaks. Let me know what you think!

Screenshot 2024-12-05 at 00 28 51 Screenshot 2024-12-05 at 00 28 58

Small question, should we cascade the image icons as well like we do for the files? 🤔

Closes: #10111

🎯 PRs Should Target Issues

Before your create a PR, please check to see if there is an existing issue for this change. If not, please create an issue before you create this PR, unless the fix is very small.

Not adhering to this guideline will result in the PR being closed.

Testing and Formatting Your Code

  1. PRs will only be merged if tests pass on CI. We recommend at least running the backend tests locally, please set up your Gradio environment locally and run the backed tests: bash scripts/run_backend_tests.sh

  2. Please run these bash scripts to automatically format your code: bash scripts/format_backend.sh, and (if you made any changes to non-Python files) bash scripts/format_frontend.sh

@gradio-pr-bot
Copy link
Collaborator

gradio-pr-bot commented Dec 3, 2024

🪼 branch checks and previews

Name Status URL
Spaces ready! Spaces preview
Website ready! Website preview
Storybook ready! Storybook preview
🦄 Changes detected! Details

Install Gradio from this PR

pip install https://gradio-pypi-previews.s3.amazonaws.com/2e228995203dd957cb6a59591605058726d7708d/gradio-5.8.0-py3-none-any.whl

Install Gradio Python Client from this PR

pip install "gradio-client @ git+https://github.com/gradio-app/gradio@2e228995203dd957cb6a59591605058726d7708d#subdirectory=client/python"

Install Gradio JS Client from this PR

npm install https://gradio-npm-previews.s3.amazonaws.com/2e228995203dd957cb6a59591605058726d7708d/gradio-client-1.8.0.tgz

Use Lite from this PR

<script type="module" src="https://gradio-lite-previews.s3.amazonaws.com/2e228995203dd957cb6a59591605058726d7708d/dist/lite.js""></script>

@gradio-pr-bot
Copy link
Collaborator

gradio-pr-bot commented Dec 3, 2024

🦄 change detected

This Pull Request includes changes to the following packages.

Package Version
@gradio/chatbot minor
gradio minor
  • Maintainers can select this checkbox to manually select packages to update.

With the following changelog entry.

Allow Chatbot examples to show more than one image

Maintainers or the PR author can modify the PR title to modify this entry.

Something isn't right?

  • Maintainers can change the version label to modify the version bump.
  • If the bot has failed to detect any changes, or if this pull request needs to update multiple packages to different versions or requires a more comprehensive changelog entry, maintainers can update the changelog file directly.

@hannahblair hannahblair marked this pull request as ready for review December 5, 2024 00:47
@abidlabs
Copy link
Member

abidlabs commented Dec 5, 2024

Wow this looks splendid @hannahblair!

My one hesitation is the addition of the "Aa" icon on top of examples without any files. I do like that it makes all of the examples the same vertical height. However, it feels quite unnecessary when all of the examples only consist of text, which is probably the most common case (e.g. see demo/chatinterface_artifacts or demo/chatinterface_multimodal).

Idea: what if we only show this "Aa" icon if multimodal=True? Examples with files are more likely to occur when multimodal=True, in which case having this icon is nice to have for the sake of clarity.

Small question, should we cascade the image icons as well like we do for the files? 🤔

I don't think we should cascade the image icons. For many VLMs, examples will often consist of a pair of images along with a question like, "what's different between these images", in which case, it'll help to be able to see the images clearly before clicking on the example.

For the sake of consistency, I'd suggest we also not cascade the file icons, but open to other folks thoughts about both of these points. cc @aliabid94 @dawoodkhan82 @allisonwhilden

@dawoodkhan82
Copy link
Collaborator

This is clean!

Agree on not cascading the file icons. Also maybe we can show a different icon for audio and video files, like we do in the multimodal textbox. Or even nicer would be to show a thumbnail for video files.

@abidlabs
Copy link
Member

abidlabs commented Dec 5, 2024

Agree with @dawoodkhan82 about icons for audio and video files, it'd be good to have consistent icons in the chatbot examples and the multimodal textbox:

image

@hannahblair
Copy link
Collaborator Author

Yep I agree, that's a great idea!

@abidlabs
Copy link
Member

abidlabs commented Dec 6, 2024

Not sure if this PR is 100% ready @hannahblair but I noticed that there's a gap on top of the the examples when mutlimodal=False:

image

(see demo/chatinterface_multimodal/run.py)

@hannahblair
Copy link
Collaborator Author

hannahblair commented Dec 9, 2024

I noticed that there's a gap on top of the the examples when mutlimodal=False

ok this was actually intentional but if it looks like an issue then this design has failed a bit 😂 i'll remove the gap instead. i would say it does look a little better when realistic prompts are added.

Screenshot 2024-12-09 at 13 41 16

@abidlabs
Copy link
Member

abidlabs commented Dec 9, 2024

@hannahblair not sure if I'm testing wrong, but I just tested with this modified version of chatinterface_multimodal/run.py and I can't see any of the file icons in the examples:

import gradio as gr

def echo(message, history):
    return message["text"]

demo = gr.ChatInterface(
    fn=echo,
    type="messages",
    examples=[{"text": "hello", "files": ["files/avatar.png"]}, {"text": "hola", "files": ["files/avatar.png", "files/cantina.wav"]}, {"text": "merhaba", "files": ["files/avatar.png", "files/avatar.png", "files/avatar.png"]}],
    title="Echo Bot",
    multimodal=True,
)
demo.launch()
image

Does this work for you?

@abidlabs
Copy link
Member

Sorry I'm confused @hannahblair why are adding the multimodal attribute to gr.Chatbot? Why does it need to know whether or not the ChatInterface is multimodal or not?

@abidlabs
Copy link
Member

abidlabs commented Dec 10, 2024

Ah okay so I understand the reason for creating the multimodal parameter now, but I don't think we should use this approach because its a leaky abstraction i.e. there's no reason the gr.Chatbot component should need to know about any higher-level abstraction like gr.ChatInterface or the gr.MultimodalTextbox.

There's a couple of approaches we could take instead:

(1) The gr.ChatInterface could be responsible for setting the icon of text-only examples before they are passed into gr.Chatbot. In other words, if the gr.ChatInterface is multimodal and there's a text-only example that does not have an example_icon, its example_icon automatically gets set to "Aa" icon. Then, its passed into gr.Chatbot, which renders the example as it would normally, including the "Aa" example_icon.

(2) The other possibility would to modify the logic a little bit so that instead of showing the "Aa" icon when the gr.ChatInterface is multimodal, we show it for any text-only example if there are any other examples that include files. This logic could be contained entirely inside gr.Chatbot since it knows all of the examples.

My preference is for (1) since it feels less "surprising", but (2) also makes sense as long as we document the behavior appropriately.

@hannahblair
Copy link
Collaborator Author

hannahblair commented Dec 11, 2024

The other possibility would to modify the logic a little bit so that instead of showing the "Aa" icon when the gr.ChatInterface is multimodal, we show it for any text-only example if there are any other examples that include files.

This won't work without some modification in the backend because even when multimodal=False in the demo, we still receive the examples with their files and icons if they've been passed to the Chat Interface. I'm thinking we need to strip the files from examples=self.examples_messages if multimodal=False when we pass the examples to Chatbot here:

self.chatbot = Chatbot(
        label="Chatbot",
        scale=1,
        height=200 if fill_height else None,
        type=self.type,
        autoscroll=autoscroll,
        examples=self.examples_messages
        if not self._additional_inputs_in_examples
        else None,
    )

so I think we need to take option 1 but include the changes to the examples_messages we're passing. do you see any issues with that?

@abidlabs
Copy link
Member

abidlabs commented Dec 11, 2024

This won't work without some modification in the backend because even when multimodal=False in the demo, we still receive the examples with their files and icons if they've been passed to the Chat Interface. I'm thinking we need to strip the files from examples=self.examples_messages if multimodal=False when we pass the examples to Chatbot here:

I do think option 1 is the way to go but why do we need to strip the files? There might be some niche cases where a user really wants to pass in files in their example for a non-multimodal chat interface, can we just let them do that? I can't think of any reason for it, but at the same time, it doesn't break anything if they do. On the other hand, it might be a breaking change for certain demos that might be using this feature.

"path": "",
"url": "",
"orig_name": "",
"mime_type": "text", # for internal use, not a valid mime type
Copy link
Collaborator Author

@hannahblair hannahblair Dec 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

here's the backend implementation i'm using to indicate in the frontend whether the example is text only

gradio/chat_interface.py Outdated Show resolved Hide resolved
@abidlabs
Copy link
Member

Gave this a quick spin before I log off and generally things are looking good, although I didn't see the "Aa" icon above the text-only example as I expected. Also not sure if multiple files are being rendered:

image

(This is running demo/test_chatinterface_multimodal_examples/run.py)

gradio/chat_interface.py Outdated Show resolved Hide resolved
@@ -324,6 +324,7 @@ def _setup_example_messages(
examples: list[str] | list[MultimodalValue] | list[list] | None,
example_labels: list[str] | None = None,
example_icons: list[str] | None = None,
multimodal: bool = False,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
multimodal: bool = False,

gradio/chat_interface.py Outdated Show resolved Hide resolved
Copy link
Member

@abidlabs abidlabs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm @hannahblair looks great! Just left one minor suggestion on the backend implementation

@hannahblair hannahblair merged commit 3665e81 into main Dec 13, 2024
23 checks passed
@hannahblair hannahblair deleted the example-images-chatbot branch December 13, 2024 14:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants