Deduplicate logging errors for "model not found" #65

Closed
cvarrei opened this issue Jul 25, 2024 · 4 comments · Fixed by #78
Labels
bug Something isn't working

Comments

@cvarrei
Contributor

cvarrei commented Jul 25, 2024

Describe the bug
Because the script checks, each time it computes an impact, whether a model exists for a specific provider, EcoLogits frantically prints the message "Could not find model `xxxxx` for xxxx provider." whenever the model/provider is not implemented or when the correct tracer is not uniquely instantiated (see reason 2).

The message comes from the llm_impacts function in utils.py:

def llm_impacts(
    provider: str,
    model_name: str,
    output_token_count: int,
    request_latency: float,
) -> Optional[Impacts]:
    """
    High-level function to compute the impacts of an LLM generation request.

    Args:
        provider: Name of the provider.
        model_name: Name of the LLM used.
        output_token_count: Number of generated tokens.
        request_latency: Measured request latency in seconds.

    Returns:
        The impacts of an LLM generation request.
    """
    model = models.find_model(provider=provider, model_name=model_name)
    if model is None:
        # TODO: Replace with proper logging
        print(f"Could not find model `{model_name}` for {provider} provider.")
        return None
    model_active_params = model.active_parameters \
                          or Range(min=model.active_parameters_range[0], max=model.active_parameters_range[1])
    model_total_params = model.total_parameters \
                         or Range(min=model.total_parameters_range[0], max=model.total_parameters_range[1])
    return compute_llm_impacts(
        model_active_parameter_count=model_active_params,
        model_total_parameter_count=model_total_params,
        output_token_count=output_token_count,
        request_latency=request_latency
    )

Reason 1: In an architecture where the user wants to use a model that is not yet implemented in EcoLogits, this message spams the logs.

Reason 2: This behaviour is also present when using the LiteLLM integration for MistralAI models and some other providers (e.g. Groq). LiteLLM relies on an OpenAI-style configuration for these providers, which also instantiates the OpenAI tracer, so for 'mistral/open-mistral-nemo-2407' we get 'Could not find model open-mistral-nemo-2407 for openai provider'. https://github.com/BerriAI/litellm/blob/b376ee71b01e3e8c6453a3dd21421b365aaaf9f8/litellm/llms/openai.py

To Reproduce
Use Mistral models with the LiteLLM tracer (mistral/open-mistral-nemo-2407) or an unimplemented provider and/or model (e.g., groq/llama3-8b-8192). The behaviour is more pronounced in stream mode (the message is printed for each token).
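For reference, a minimal reproduction sketch of the LiteLLM streaming case (the model name is the one mentioned above; this assumes ecologits and litellm are installed):

import litellm
from ecologits import EcoLogits

EcoLogits.init()

# Streaming a Mistral model through LiteLLM also goes through an
# OpenAI-style configuration, so the OpenAI tracer is triggered and
# "Could not find model ..." is printed for each received chunk.
response = litellm.completion(
    model="mistral/open-mistral-nemo-2407",
    messages=[{"role": "user", "content": "Hello, how are you?"}],
    stream=True,
)
for chunk in response:
    pass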

Proposed solution

  • To avoid this interaction between the OpenAI tracer and the LiteLLM tracer, adopting the LiteLLM tracer strategy as the default strategy for finding the provider could be useful: provider=models.find_provider(model_name=model_name).

  • For the message, I suggest either not displaying it when no model is returned, or adding a verbosity parameter (False by default) when initialising the EcoLogits object, to give the user the choice of whether or not to display this message (see the sketch below).
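A minimal sketch of the second suggestion, assuming a module-level logger and a hypothetical VERBOSE flag set at initialisation (illustrative names, not the actual EcoLogits API):

import logging

logger = logging.getLogger("ecologits")

# Hypothetical flag, set once at initialisation,
# e.g. via something like EcoLogits.init(verbose=False).
VERBOSE = False


def warn_model_not_found(model_name: str, provider: str) -> None:
    # Replaces the bare print() in llm_impacts so the message can be
    # silenced (or later filtered/deduplicated) by configuring the logger.
    if VERBOSE:
        logger.warning(
            "Could not find model `%s` for %s provider.", model_name, provider
        )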

@adrienbanse
Member

Hey @cvarrei, many thanks for reporting this!

I agree with the first proposed solution as a quick fix, although we may think of a more in-depth solution in the near future for handling the difference between providers (such as Mistral in this case) and clients (such as LiteLLM here).

Concerning the second solution, a verbose initialisation parameter is a good solution. It is planned anyway to add initialisation parameters soon, such as the electricity mix, so we may include both in the same PR.

@cvarrei
Contributor Author

cvarrei commented Jul 25, 2024

A more in-depth solution for the first case, which simplifies the task and merges with the initialisation parameters PR, could be to require the user to specify, at initialisation, the tracer they will be using, for example:

# Initialize EcoLogits
EcoLogits.init(tracer='litellm')

response = litellm.completion(
    model="gpt-4o-2024-05-13",
    messages=[{ "content": "Hello, how are you?","role": "user"}]
)

And by modifying init() in ecologits.py:

class EcoLogits:
    initialized = False

    @staticmethod
    def init(tracer: str) -> None:
        """Initialization static method."""
        if not EcoLogits.initialized:
            get_instrumentor(tracer=tracer)
            EcoLogits.initialized = True


def get_instrumentor(tracer: str) -> None:
    dict_instrumentor = {
        "openai": init_openai_instrumentor,
        "anthropic": init_anthropic_instrumentor,
        "mistralai": init_mistralai_instrumentor,
        "huggingface_hub": init_huggingface_instrumentor,
        "cohere": init_cohere_instrumentor,
        "google": init_google_instrumentor,
        "litellm": init_litellm_instrumentor,
    }
    dict_instrumentor[tracer]()

This way, init_instruments() and the importlib find_spec calls can be removed.

In this way, there is no automatic initialisation depending on which modules are installed, but only depending on where the user wants to integrate it. With this setting, there is no unwanted secondary initialization (as was the case with google and google-generativeai). This may be better suited to an environment where multiple models are used, or to a multi-provider client (like LiteLLM).
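For completeness, a usage sketch of this proposal with another tracer (hypothetical API, following the proposed init() above):

# Only the Anthropic instrumentor is initialised; the OpenAI tracer is never
# instantiated, so no cross-provider "model not found" messages are printed.
EcoLogits.init(tracer="anthropic")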

@samuelrince
Member

samuelrince commented Aug 27, 2024

Two things here:

  • Choosing which provider(s) you want to enable is a good idea. I am not sure if we should keep it in "eager mode" or not (i.e. init any available providers by default or not). It's a design / UX choice more than a technical one.
  • We may need to filter/dedup logs and thus configure a basic logger in the library to avoid the spam (see the sketch below).
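A minimal sketch of the dedup idea with the standard logging module, assuming the library emits the "model not found" message through a logger named "ecologits" (the filter class is illustrative):

import logging


class DedupFilter(logging.Filter):
    """Let each distinct log message through only once."""

    def __init__(self) -> None:
        super().__init__()
        self._seen = set()

    def filter(self, record: logging.LogRecord) -> bool:
        message = record.getMessage()
        if message in self._seen:
            return False  # drop repeated messages
        self._seen.add(message)
        return True


logger = logging.getLogger("ecologits")
logger.addFilter(DedupFilter())

# The second identical warning is filtered out.
logger.warning("Could not find model `%s` for %s provider.", "llama3-8b-8192", "groq")
logger.warning("Could not find model `%s` for %s provider.", "llama3-8b-8192", "groq")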

@samuelrince samuelrince added the bug Something isn't working label Aug 28, 2024
@samuelrince samuelrince changed the title Bug -- Frenetically printing of "Could not find '....' for .... provider" Deduplicate logging errors for "model not found" Aug 29, 2024
@adrienbanse adrienbanse linked a pull request Sep 5, 2024 that will close this issue
@adrienbanse
Member

@samuelrince @cvarrei A proper logger has now been implemented in #78, thereby closing this issue, but please re-open it if you consider that the problem is not solved.
