Error When Accessing Azure Serverless Models #4771
**What happened?**

When attempting to utilize a serverless model in Azure AI Foundry (Azure OpenAI), I am receiving a 401 error, even though the API key is verified and correct.

Config:

```yaml
endpoints:
  azureOpenAI:
    titleModel: 'Meta-Llama-3.1-8B-Instruct'
    plugins: false
    assistants: false
    groups:
      - group: 'Llama'
        serverless: true
        apiKey: '<REDACTED>'
        baseURL: 'https://<REDACTED>.services.ai.azure.com/models/chat/completions?api-version=2024-05-01-preview'
        models:
          Meta-Llama-3.1-8B-Instruct: true
```

**Steps to Reproduce**
**What browsers are you seeing the problem on?**

No response

**Relevant log output**

```
error: [handleAbortError] AI response error; aborting request: Failed to send message. HTTP 401 - {
  "statusCode": 401,
  "message": "Unauthorized. Access token is missing, invalid, audience is incorrect (https://cognitiveservices.azure.com), or have expired."
}
```

**Screenshots**

No response
---
I'm having trouble with Azure serverless models even with raw (cURL) API requests, including all of the code examples that Azure provides.
---
Figured it out after running a couple of different tests. All of their documentation says to use `Authorization: Bearer` auth for serverless inference requests (source); however, we need to use the `api-key` header instead. I will make some changes to account for this, as well as supporting …
---
I'm still updating the docs, but the update is now live.
Here is an example configuration for Meta-Llama-3.1-8B-Instruct:
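A minimal sketch of what that looks like, reconstructed from the config in the original post with the fixes described below (resource name redacted, values are placeholders):

```yaml
endpoints:
  azureOpenAI:
    titleModel: 'Meta-Llama-3.1-8B-Instruct'
    plugins: false
    assistants: false
    groups:
      - group: 'Llama'
        serverless: true
        apiKey: '<REDACTED>'
        # Root of the endpoint only; see the Notes below
        baseURL: 'https://<REDACTED>.services.ai.azure.com'
        models:
          Meta-Llama-3.1-8B-Instruct: true
```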
Notes:

- The `/models/chat/completions?api-version=<version>` path is appended for serverless inference.
- The `baseURL` field should therefore be set to the root of the endpoint, without anything after it.
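To make those two notes concrete, here is how the pieces combine, assuming the same `2024-05-01-preview` api-version as the original post:

```yaml
# What you configure (root of the endpoint only):
baseURL: 'https://<REDACTED>.services.ai.azure.com'

# What is actually requested for serverless inference
# (path and api-version appended to the root):
#   https://<REDACTED>.services.ai.azure.com/models/chat/completions?api-version=2024-05-01-preview
```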