
[bug] Medusa example fails with vicuna 33B #2478

Open
SoundProvider opened this issue Nov 21, 2024 · 4 comments

SoundProvider commented Nov 21, 2024

Thank you for developing TRT-LLM; it's helping me a lot.
I'm trying to use Medusa with TRT-LLM, referencing this page.

It works fine with Vicuna 7B and its Medusa heads, with no errors at all.

However, with Vicuna 33B and its trained heads, the following error occurs when executing trtllm-build.
Converting the checkpoint with Medusa completed with the following result:
[image: checkpoint conversion output]

## Running script
CUDA_VISIBLE_DEVICES=${DEVICES} \
trtllm-build --checkpoint_dir /app/medusa_test/tensorrt/${TP_SIZE}-gpu \
             --gpt_attention_plugin float16 \
             --gemm_plugin float16 \
             --context_fmha enable \
             --output_dir /app/medusa_test/tensorrt_llm/${TP_SIZE}-gpu \
             --speculative_decoding_mode medusa \
             --max_batch_size ${BATCH_SIZE} \
             --max_input_len ${SEQ_LEN} \
             --max_seq_len ${SEQ_LEN} \
             --max_num_tokens ${SEQ_LEN} \
             --workers ${TP_SIZE} 
concurrent.futures.process._RemoteTraceback: 
"""
Traceback (most recent call last):
  File "/usr/lib/python3.10/multiprocessing/queues.py", line 244, in _feed
    obj = _ForkingPickler.dumps(obj)
  File "/usr/lib/python3.10/multiprocessing/reduction.py", line 51, in dumps
    cls(buf, protocol).dump(obj)
AttributeError: Can't pickle local object 'MedusaConfig.__init__.<locals>.GenericMedusaConfig'
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/tensorrt_llm/commands/build.py", line 437, in parallel_build
    future.result()
  File "/usr/lib/python3.10/concurrent/futures/_base.py", line 451, in result
    return self.__get_result()
  File "/usr/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result
    raise self._exception
  File "/usr/lib/python3.10/multiprocessing/queues.py", line 244, in _feed
    obj = _ForkingPickler.dumps(obj)
  File "/usr/lib/python3.10/multiprocessing/reduction.py", line 51, in dumps
    cls(buf, protocol).dump(obj)
AttributeError: Can't pickle local object 'MedusaConfig.__init__.<locals>.GenericMedusaConfig'
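For context, this AttributeError is a standard Python limitation rather than anything TRT-LLM-specific: pickle serializes classes by their qualified name, so a class defined inside a function body (a `<locals>` name, like `GenericMedusaConfig` inside `MedusaConfig.__init__` here) cannot be pickled, and multiprocessing workers exchange objects via pickle. A minimal sketch reproducing the symptom (the `LocalConfig`/`make_local_config` names below are illustrative, not from TRT-LLM):

```python
import pickle


class ModuleLevelConfig:
    """Defined at module scope, so pickle can locate it by name."""

    def __init__(self, num_heads):
        self.num_heads = num_heads


def make_local_config(num_heads):
    # Defined inside a function: pickle stores only a qualified-name
    # reference, and "<locals>" names cannot be resolved on unpickling.
    class LocalConfig:
        def __init__(self, n):
            self.num_heads = n

    return LocalConfig(num_heads)


# The module-level class round-trips fine.
ok = pickle.loads(pickle.dumps(ModuleLevelConfig(4)))
assert ok.num_heads == 4

# Pickling an instance of the local class fails, just like the
# traceback above when the build worker tries to send the config.
try:
    pickle.dumps(make_local_config(4))
except (AttributeError, pickle.PicklingError) as e:
    print(type(e).__name__)
```

The usual fix on the library side is to hoist such a class to module level; on the user side, regenerating the checkpoint with a TRT-LLM version where the config class is module-level (as the maintainer suggests below) avoids the error.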
@hello-11
Collaborator

@SoundProvider, could you also show the command to convert the checkpoint?

@SoundProvider
Author

DEVICES=0,1,2,3
TP_SIZE=4
BATCH_SIZE=4


CUDA_VISIBLE_DEVICES=${DEVICES} \
python /app/tensorrt_llm/examples/medusa/convert_checkpoint.py \
                            --model_dir /app/models/vicuna-33b-v1.3 \
                            --medusa_model_dir /app/models/medusa-vicuna-33b-v1.3 \
                            --output_dir /app/models/medusa_test/tensorrt/${TP_SIZE}-gpu \
                            --dtype float16 \
                            --num_medusa_heads 4 \
                            --tp_size ${TP_SIZE} 


CUDA_VISIBLE_DEVICES=${DEVICES} \
trtllm-build --checkpoint_dir /app/models/medusa_test/tensorrt/${TP_SIZE}-gpu \
             --gpt_attention_plugin float16 \
             --gemm_plugin float16 \
             --context_fmha enable \
             --output_dir /app/models/medusa_test/tensorrt_llm/${TP_SIZE}-gpu \
             --speculative_decoding_mode medusa \
             --max_batch_size ${BATCH_SIZE} \
             --workers ${TP_SIZE} 

@hello-11 I'm using the Medusa example here.

@rakib-hasan

Hi @SoundProvider, I just tried to build a Medusa engine with the Vicuna-33B model with TP=1 and TP=4 using the TRT-LLM 0.15 release.
The engines were built without any issues for both TP=1 and TP=4.

Since the error is related to pickle, it seems your converted checkpoint config is outdated. Could you please try converting the checkpoint again and then building?

If you still run into the same issue, can you share which version of TRT-LLM you are using?

@SoundProvider
Copy link
Author

Hello @rakib-hasan,
Thank you for sharing the good news.
I'm currently working on another issue; I'll tag you here once I've finished testing what you requested.
Thank you.
