obj_size <= remaining_buffer_size #680

qzq-123 · 2025-01-20T08:12:24Z

Does anyone know what obj_size and remaining_buffer_size refer to? Where can I adjust them?

Container startup parameters:
docker run --rm -it --net host --shm-size=20g \
    --ulimit memlock=-1 --ulimit stack=67108864 --gpus all

I have an A5000 GPU and am running the Qwen2.5-3B-Instruct model. Running the engine directly works:
python3 ../run.py --input_text "Hello, what is your name?" --max_output_len=50 --tokenizer_dir ./tmp/Qwen/3B/ --engine_dir=./tmp/Qwen/3B/trt_engines/int4_weight_only/1-gpu/
This returns normal results.

But starting the backend service reports an error:
python3 /tensorrtllm_backend/scripts/launch_triton_server.py --world_size=1 --model_repo=${MODEL_FOLDER}


mutkach · 2025-01-28T12:46:49Z

I've encountered the same problem with MLlama.
