Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
examples/grok/README.md
.examples/llama/README.md
).qkv_bias
shape issue for Qwen1.5-32B (convert qwen 110b gptq checkpoint的时候,qkv_bias 的shape不能被3整除 #1589), thanks to the contribution from @Tlntin in fix up qkv.bias error when use qwen1.5-32b-gptq-int4 #1637.fpA_intB
, thanks to the contribution from @JamesTheZ in Fix the error of Ada traits for fpA_intB. #1583.examples/qwenvl/requirements.txt
, thanks to the contribution from @ngoanpv in Update requirements.txt #1248.lora_manager
, thanks to the contribution from @TheCodeWrangler in Fixed rslora scaling in lora_manager #1669.benchmarks/cpp/README.md
for gptManagerBenchmark seems to go into a dead loop with GPU usage 0% #1562 and Cannot process new request: [TensorRT-LLM][ERROR] Assertion failed: LoRA task 0 not found in cache. Please send LoRA weights with request (/home/jenkins/agent/workspace/LLM/main/L0_PostMerge/llm/cpp/tensorrt_llm/batch_manager/peftCacheManager.cpp:182) #1552.