Issues: casper-hansen/AutoAWQ
- Same Memory (VRAM) with different batch_size, Prefill Length, Decode Length. (#691, opened Jan 15, 2025 by rayzr0123)
- keep getting error regarding missing positional argument 'attention_mask' (#690, opened Jan 14, 2025 by BBC-Esq)
- Multi-GPU/CPU offloading is still not working as intended (#689, opened Jan 7, 2025 by haitham-boxmind)
- Question: Error when substituting the quantized matrix multiplication operator. (#670, opened Dec 4, 2024 by grysgreat)
- probability tensor contains either inf, nan or element < 0 (#657, opened Nov 26, 2024 by alvaropastor7)
- Does AutoAWQ support to quantize GLM-4-9B-Chat and ChatGLM3-6B two models? (#642, opened Nov 7, 2024 by shawn9977)
- After using autoawq to quantify the model, an error occurs when inferring the model (#636, opened Oct 21, 2024 by xuanzhangyang)