Description
We have custom proxied endpoints for our LLMs (OpenAI, Gemini & Anthropic) that require their own authentication and specialized request handling. The official LangChain.js documentation recommends creating custom LLMs by overriding the `_generate` method. However, doing so forces us to reimplement various utility functions that convert between internal message formats and client responses.

To avoid duplicating all of this conversion logic, we're exploring an alternative approach: overriding the `completionWithRetry` method in a subclass of `ChatOpenAI`. That way we can swap out just the HTTP request logic to call our custom, authenticated endpoint, while still leveraging all of LangChain.js's built-in message conversion utilities and callback integrations.

We're also wondering whether this approach is applicable to other models (such as Gemini) that might require similar customizations, and whether there are potential pitfalls (e.g., streaming, token tracking, or callback compatibility issues) in deviating from the recommended `_generate` override.

My Questions:
- Is overriding `completionWithRetry`, rather than `_generate`, a sound approach for integrating a custom, authenticated REST endpoint?
- Are there downsides to this compared to overriding `_generate`, given that it avoids duplicating message conversion logic?

System Info

[email protected]