[Feature]: GGUF quantization with tensor parallelism #7662
Comments
@chrismrutherford thanks for reporting, we just landed a fix!
Hi, I'm encountering an issue and wanted to report it here. The following is the Python code that I used for testing. It worked well with

from huggingface_hub import hf_hub_download
from vllm import LLM, SamplingParams

def run_gguf_inference(model_path):
    # Load the GGUF checkpoint, sharding it across 2 GPUs.
    llm = LLM(
        model=model_path,
        max_model_len=4096,
        tokenizer="meta-llama/Meta-Llama-3.1-8B-Instruct",
        tensor_parallel_size=2,
    )
    tokenizer = llm.get_tokenizer()
    # Format a single-turn conversation with the model's chat template.
    conversations = tokenizer.apply_chat_template(
        [{'role': 'user', 'content': 'what is the future of AI?'}],
        tokenize=False,
        add_generation_prompt=True,
    )
    outputs = llm.generate(
        [conversations],
        SamplingParams(temperature=0, max_tokens=1000),
    )
    for output in outputs:
        print(output)

if __name__ == "__main__":
    repo_id = "bullerwins/Meta-Llama-3.1-8B-Instruct-GGUF"
    filename = "Meta-Llama-3.1-8B-Instruct-Q2_K.gguf"
    # Download the GGUF file from the Hugging Face Hub, then run inference.
    model = hf_hub_download(repo_id, filename=filename)
    run_gguf_inference(model)
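Until tensor parallelism is supported for GGUF, a single-GPU run should avoid the failing code path. This is a minimal workaround sketch, assuming one GPU has enough memory for the Q2_K file; it is not an officially documented fallback:

from huggingface_hub import hf_hub_download
from vllm import LLM, SamplingParams

# Single-GPU fallback: keep tensor_parallel_size at 1 so the GGUF weights
# are loaded without the unsupported tensor-parallel path.
model = hf_hub_download(
    "bullerwins/Meta-Llama-3.1-8B-Instruct-GGUF",
    filename="Meta-Llama-3.1-8B-Instruct-Q2_K.gguf",
)
llm = LLM(
    model=model,
    max_model_len=4096,
    tokenizer="meta-llama/Meta-Llama-3.1-8B-Instruct",
    tensor_parallel_size=1,  # TP > 1 is what triggers the ValueError
)
out = llm.generate(
    ["What is the future of AI?"],
    SamplingParams(temperature=0, max_tokens=64),
)
print(out[0].outputs[0].text)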
🚀 The feature, motivation and pitch
When I launch vLLM with a GGUF model (Q8_0 snapshot) and Ray (--tensor-parallel-size 8, across 2 nodes with 4 GPUs each), I get the following error message:
(RayWorkerWrapper pid=11033) ERROR 08-19 16:07:35 worker_base.py:438] ValueError: GGUF quantization hasn't supported tensor parallelism yet. [repeated 2x across cluster]
Could you please add tensor-parallelism support for GGUF quantization?
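For reference, here is a minimal sketch of the kind of launch that hits this check, written against the offline LLM API; the GGUF path is a placeholder and the explicit Ray backend selection is an assumption about the multi-node setup, not the exact invocation from this report:

from vllm import LLM

# Hypothetical reproduction: a GGUF checkpoint sharded across 8 GPUs on a
# Ray cluster spanning 2 nodes. At the time of this issue, loading fails with
# ValueError: GGUF quantization hasn't supported tensor parallelism yet.
# as soon as tensor_parallel_size > 1.
llm = LLM(
    model="/path/to/model-Q8_0.gguf",  # local GGUF file (illustrative path)
    tokenizer="meta-llama/Meta-Llama-3.1-8B-Instruct",
    tensor_parallel_size=8,            # 2 nodes x 4 GPUs
    distributed_executor_backend="ray",  # multi-node execution via Ray
)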
Alternatives
No response
Additional context
No response