feat: Image input (gpt-4 vision preview) support #121

yiwufen · 2024-11-25T09:01:57Z

Inquiry About Image Input Support (e.g., Using GPT-4 Vision Preview API)

Description

Hello,

I am currently using your framework for project development and would like to know if it supports image input functionalities. Specifically, I am interested in whether it's possible to integrate and utilize OpenAI's GPT-4 Vision Preview API for processing and analyzing image data.

Questions

1. Image Input Support: Does the framework have built-in modules or functionalities that support image inputs?
2. Usage Examples or Documentation: If supported, could you provide relevant usage examples or links to documentation to guide me on how to integrate and use this feature?
3. Future Feature Plans: If image input is not currently supported, are there any plans to include this functionality in future releases?
4. Integration with Third-Party Libraries: If the framework does not support image inputs at the moment, do you have any recommended methods or third-party libraries that can be easily integrated to add image input capabilities?

Additional Information

To better meet my project requirements, I aim to leverage both image processing capabilities and existing text processing features. If any example code or best practices are available, I would greatly appreciate it if you could share them.

Thank you for your assistance!

0xMochan · 2024-11-28T20:47:52Z

We do not currently have any multimodal inputs.
There are definitely plans, but it would require a more in-depth look at our current API to figure out where image input could fit. Other features are currently a bit higher on our TODO list.
You could look into working directly with something like burn (which we plan to integrate with more deeply in the future). It is a much lower-level library but should have examples that could get you going!

Multimodal rig agents are definitely an important feature on our minds, but it'll take a coordinated effort to build a suitable and elegant API for it!

mateobelanger added the feature request label Nov 25, 2024

mateobelanger changed the title ~~feat: <title>~~ feat: Image input (gpt-4 vision preview) support Nov 25, 2024

cvauclair mentioned this issue Dec 9, 2024

feat: Update completion API Message type #146

Open

1 task
