TGI latest CPU version doesn't work with some models #625
Comments
What is the issue you are facing? Can you please post the error log from Docker here?
I'm using `helm install` to test:
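For context, a minimal sketch of the kind of command meant here; the chart path and value names are assumptions, since the issue does not include them, and only the image tag and model id come from the report itself:

```bash
# Hypothetical invocation: chart location and value names are assumptions,
# not taken from the issue; only the image tag and model id appear in it.
helm install tgi ./helm-charts/tgi \
  --set image.repository=ghcr.io/huggingface/text-generation-inference \
  --set image.tag=latest-intel-cpu \
  --set LLM_MODEL_ID=ise-uiuc/Magicoder-S-DS-6.7B
```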
Error message/pod logs:

```
{"timestamp":"2024-08-19T05:38:39.361300Z","level":"INFO","fields":{"message":"Args {
    model_id: "ise-uiuc/Magicoder-S-DS-6.7B",
    revision: None,
    validation_workers: 2,
    sharded: None,
    num_shard: None,
    quantize: None,
    speculate: None,
    dtype: None,
    trust_remote_code: false,
    max_concurrent_requests: 128,
    max_best_of: 2,
    max_stop_sequences: 4,
    max_top_n_tokens: 5,
    max_input_tokens: None,
    max_input_length: None,
    max_total_tokens: None,
    waiting_served_ratio: 0.3,
    max_batch_prefill_tokens: None,
    max_batch_total_tokens: None,
    max_waiting_tokens: 20,
    max_batch_size: None,
    cuda_graphs: None,
    hostname: "tgi-874bfcffc-c4wst",
    port: 2080,
    shard_uds_path: "/tmp/text-generation-server",
    master_addr: "localhost",
    master_port: 29500,
    huggingface_hub_cache: Some(
        "/data",
    ),
    weights_cache_override: None,
    disable_custom_kernels: false,
    cuda_memory_fraction: 1.0,
    rope_scaling: None,
    rope_factor: None,
    json_output: true,
    otlp_endpoint: None,
    otlp_service_name: "text-generation-inference.router",
    cors_allow_origin: [],
    api_key: None,
    watermark_gamma: None,
    watermark_delta: None,
    ngrok: false,
    ngrok_authtoken: None,
    ngrok_edge: None,
    tokenizer_config_path: None,
    disable_grammar_support: false,
    env: false,
    max_client_batch_size: 4,
    lora_adapters: None,
    usage_stats: On,
}"},"target":"text_generation_launcher"}
```
Can this be closed?
After updating the TGI image to
`ghcr.io/huggingface/text-generation-inference:latest-intel-cpu`,
the CodeGen test failed with the following two models:
- ise-uiuc/Magicoder-S-DS-6.7B
- m-a-p/OpenCodeInterpreter-DS-6.7B
The latter is mentioned in the README file of CodeGen:
https://github.com/opea-project/GenAIExamples/tree/main/CodeGen
The default model (meta-llama/CodeLlama-7b-hf) specified in the docker-compose file runs fine.
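To illustrate the failing case, here is a `docker run` equivalent of such a compose service; the port and volume mappings are assumptions, while the image tag, model id, and `/data` cache path come from the report and the `Args` log above:

```bash
# Sketch of a reproduction command; the -p and -v choices are assumptions.
# The image tag and --model-id come from the report; /data matches the
# huggingface_hub_cache value in the Args log above.
docker run --rm -p 8080:80 -v "$PWD/data:/data" \
  ghcr.io/huggingface/text-generation-inference:latest-intel-cpu \
  --model-id ise-uiuc/Magicoder-S-DS-6.7B
```

Swapping `--model-id` to the compose default, meta-llama/CodeLlama-7b-hf, is the working case described above.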