Issues: casper-hansen/AutoAWQ
- Same Memory (VRAM) with different batch_size, Prefill Length, Decode Length. (#691, opened Jan 15, 2025 by rayzr0123)
- keep getting error regarding missing positional argument 'attention_mask' (#690, opened Jan 14, 2025 by BBC-Esq)
- Multi-GPU/CPU offloading is still not working as intended (#689, opened Jan 7, 2025 by haitham-boxmind)
- Question: Error when substituting the quantized matrix multiplication operator. (#670, opened Dec 4, 2024 by grysgreat)
- probability tensor contains either inf, nan or element < 0 (#657, opened Nov 26, 2024 by alvaropastor7)
- Does AutoAWQ support to quantize GLM-4-9B-Chat and ChatGLM3-6B two models? (#642, opened Nov 7, 2024 by shawn9977)
- After using autoawq to quantify the model, an error occurs when inferring the model (#636, opened Oct 21, 2024 by xuanzhangyang)