From 81cfd8d0ad9aced972394a77e77b7d687768e33d Mon Sep 17 00:00:00 2001
From: Jacob Cable <32874567+cabljac@users.noreply.github.com>
Date: Fri, 29 Nov 2024 12:54:41 +0000
Subject: [PATCH] docs(firestore-multimodal-genai): update docs (#602)

---
 firestore-multimodal-genai/POSTINSTALL.md | 10 ++++++----
 firestore-multimodal-genai/PREINSTALL.md  | 15 ++++++++++++---
 firestore-multimodal-genai/README.md      | 15 ++++++++++++---
 3 files changed, 30 insertions(+), 10 deletions(-)

diff --git a/firestore-multimodal-genai/POSTINSTALL.md b/firestore-multimodal-genai/POSTINSTALL.md
index 76e0e2bf..d6877806 100644
--- a/firestore-multimodal-genai/POSTINSTALL.md
+++ b/firestore-multimodal-genai/POSTINSTALL.md
@@ -40,13 +40,15 @@ For Vertex AI, the list of models is [here](https://cloud.google.com/vertex-ai/d

 #### Multimodal Prompts

-Many of the Gemini models accept multimodal prompts. This extension allows for multimodal prompting with images using this model.
+Many of the Gemini models accept multimodal prompts. This extension allows for multimodal prompting with images using such models. Note that this feature is not supported for models such as `gemini-1.0-pro` which do not allow multimodal prompts.

-On installation you may pick an `image` field. The image field must be the Cloud Storage URL of an object (e.g `gs://my-bucket.appspot.com/filename.png`). This image will then be provided as part of the prompt to Gemini Pro Vision.
+On installation you may pick an `image` field. The image field must be the Cloud Storage URL of an object (e.g `gs://my-bucket.appspot.com/filename.png`). This image will then be provided as part of the prompt.

-Note that Google AI requires prompts to have both an image and text part, whereas Vertex AI allows gemini-pro-vision to be prompted with text only as well.
+##### Gemini Pro Vision (deprecated)

-If you have selected to use the Gemini Pro Vision model (deprecated) and have Google AI as a provider then any document handled by the extension must contain an image field.
+This extension has historically supported calls to the (now deprecated) Gemini Pro Vision model on Google AI and Vertex AI APIs.
+
+For the Gemini Pro Vision models Google AI requires prompts to have both an image and text part, whereas Vertex AI allows gemini-pro-vision to be prompted with text only as well. If you have selected to use the Gemini Pro Vision model (deprecated) and have Google AI as a provider then any document handled by the extension must contain an image field.

 The Gemini Pro Vision API has a limit on image sizes. For Google AI this limit is currently 1MB, and for Vertex AI this limit is 4MB. This extension compress and resize images that fall above this limit.

diff --git a/firestore-multimodal-genai/PREINSTALL.md b/firestore-multimodal-genai/PREINSTALL.md
index f61f7e88..81a77d5e 100644
--- a/firestore-multimodal-genai/PREINSTALL.md
+++ b/firestore-multimodal-genai/PREINSTALL.md
@@ -66,11 +66,20 @@ For Vertex AI, the list of models is [here](https://cloud.google.com/vertex-ai/d

 #### Multimodal Prompts

-This extension supports providing multimodal prompts. To use this feature, select the Gemini Pro Vision model on installation, and provide an Image Field parameter. The Image Field parameter should be the name of a document field in firestore.
+Many Gemini models, such as **Gemini 1.5 Flash**, support multimodal prompts, allowing both text and image inputs. This feature is not supported by text-only models like `gemini-1.0-pro`.

-When you select these options, any document handled by the extension must contain an image field. The image field must be a string, and can either be the Cloud Storage URL of an object (e.g `gs://my-bucket.appspot.com/filename.png`). This image will then be provided as part of the prompt to Gemini Pro Vision.
+**Image Field Configuration:**
+During installation, you may specify an **Image Field**. This installation parameter is a string which corresponds to a field in Cloud Firestore documents.

-The Gemini Pro Vision API has a limit on image sizes. For Google AI this limit is currently 1MB, and for Vertex AI this limit is 4MB. This extension will compress and resize images that fall above this limit.
+When you select these options, any document handled by the extension must contain an image field. The image field must be a string, and can either be the Cloud Storage URL of an object (e.g `gs://my-bucket.appspot.com/filename.png`).
+
+##### Gemini Pro Vision (deprecated)
+
+This extension has historically supported calls to the (now deprecated) Gemini Pro Vision model on Google AI and Vertex AI APIs.
+
+For the Gemini Pro Vision models Google AI requires prompts to have both an image and text part, whereas Vertex AI allows gemini-pro-vision to be prompted with text only as well. If you have selected to use the Gemini Pro Vision model (deprecated) and have Google AI as a provider then any document handled by the extension must contain an image field.
+
+The Gemini Pro Vision API has a limit on image sizes. For Google AI this limit is currently 1MB, and for Vertex AI this limit is 4MB. This extension compress and resize images that fall above this limit.

 ### Troubleshooting timeout/PROCESSING errors

diff --git a/firestore-multimodal-genai/README.md b/firestore-multimodal-genai/README.md
index 8ee899fd..ee97e31b 100644
--- a/firestore-multimodal-genai/README.md
+++ b/firestore-multimodal-genai/README.md
@@ -74,11 +74,20 @@ For Vertex AI, the list of models is [here](https://cloud.google.com/vertex-ai/d

 #### Multimodal Prompts

-This extension supports providing multimodal prompts. To use this feature, select the Gemini Pro Vision model on installation, and provide an Image Field parameter. The Image Field parameter should be the name of a document field in firestore.
+Many Gemini models, such as **Gemini 1.5 Flash**, support multimodal prompts, allowing both text and image inputs. This feature is not supported by text-only models like `gemini-1.0-pro`.

-When you select these options, any document handled by the extension must contain an image field. The image field must be a string, and can either be the Cloud Storage URL of an object (e.g `gs://my-bucket.appspot.com/filename.png`). This image will then be provided as part of the prompt to Gemini Pro Vision.
+**Image Field Configuration:**
+During installation, you may specify an **Image Field**. This installation parameter is a string which corresponds to a field in Cloud Firestore documents.

-The Gemini Pro Vision API has a limit on image sizes. For Google AI this limit is currently 1MB, and for Vertex AI this limit is 4MB. This extension will compress and resize images that fall above this limit.
+When you select these options, any document handled by the extension must contain an image field. The image field must be a string, and can either be the Cloud Storage URL of an object (e.g `gs://my-bucket.appspot.com/filename.png`).
+
+##### Gemini Pro Vision (deprecated)
+
+This extension has historically supported calls to the (now deprecated) Gemini Pro Vision model on Google AI and Vertex AI APIs.
+
+For the Gemini Pro Vision models Google AI requires prompts to have both an image and text part, whereas Vertex AI allows gemini-pro-vision to be prompted with text only as well. If you have selected to use the Gemini Pro Vision model (deprecated) and have Google AI as a provider then any document handled by the extension must contain an image field.
+
+The Gemini Pro Vision API has a limit on image sizes. For Google AI this limit is currently 1MB, and for Vertex AI this limit is 4MB. This extension compress and resize images that fall above this limit.

 ### Troubleshooting timeout/PROCESSING errors