[bnb] Add fp4 support for dispatch #1505
Conversation
# quantize only if necessary
device_index = torch.device(device).index if torch.device(device).type == "cuda" else None
if not getattr(module.weight, "quant_state", None) and device_index is not None:
    module.weight = module.weight.cuda(device_index)
The `.cuda` function is very, very deprecated. You should use `.to`.
Hmm, I think the way bitsandbytes has designed its `Linear4bit` layers, we need to call `cuda`: https://github.com/TimDettmers/bitsandbytes/blob/ac5550a0238286377ee3f58a85aeba1c40493e17/bitsandbytes/nn/modules.py#L152 It seems to be the only way to quantize the weights :/ I tried it with `to` and it didn't work. (Note that at that point `module.weight` is a `bnb.nn.Params4bit` parameter.)
Oh ok. Not very PyTorch-ic then.
yeah! :/
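As an aside, a minimal sketch of the behaviour described above, assuming the `Linear4bit`/`Params4bit` API at the linked bitsandbytes commit, an available CUDA device, and arbitrary layer sizes (this is not code from the PR): quantization happens when the `Params4bit` weight is moved with `.cuda()`, which is why the dispatch code checks `quant_state` before calling `.cuda(device_index)`.

# Minimal sketch, assuming the bitsandbytes API at the linked commit and an
# available CUDA device; layer sizes are arbitrary.
import bitsandbytes as bnb

linear = bnb.nn.Linear4bit(64, 64, quant_type="fp4")  # weight is a bnb.nn.Params4bit

# Mirrors the check in the dispatch code above: not quantized yet on CPU.
assert getattr(linear.weight, "quant_state", None) is None

# Moving the weight with .cuda() is what triggers the 4-bit quantization.
linear.weight = linear.weight.cuda(0)
assert linear.weight.quant_state is not None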
The documentation is not available anymore as the PR was closed or merged.
tests/test_big_modeling.py (Outdated)
"""Tests that `dispatch_model` quantizes int8 layers""" | ||
from huggingface_hub import hf_hub_download | ||
from transformers import AutoConfig, AutoModel, BitsAndBytesConfig | ||
from transformers.utils.bitsandbytes import replace_8bit_linear |
Isn't this function renamed to something else?
ah yes, let me modify that
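For reference, a sketch of what the updated import might look like, assuming the helper was renamed to `replace_with_bnb_linear` and still lives in `transformers.utils.bitsandbytes` (both are assumptions to verify against the installed transformers version):

# Assumed rename: replace_8bit_linear -> replace_with_bnb_linear.
# Verify the name and module path against the transformers version in use.
from transformers.utils.bitsandbytes import replace_with_bnb_linear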
tests/test_big_modeling.py (Outdated)
@slow
@unittest.skip("Un-skip in the next transformers release")
def test_dipatch_model_fp4_simple(self):
    """Tests that `dispatch_model` quantizes int8 layers"""
To adapt: the docstring still mentions int8 layers.
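A minimal sketch of the adapted skeleton, assuming `slow` comes from `accelerate.test_utils` and using a hypothetical test-class name; the function name is kept exactly as in the diff above, only the docstring changes, and the test body is elided:

# Sketch of the adapted test skeleton (docstring now matches the fp4 case).
import unittest

from accelerate.test_utils import slow


class DispatchFP4Test(unittest.TestCase):  # hypothetical class name
    @slow
    @unittest.skip("Un-skip in the next transformers release")
    def test_dipatch_model_fp4_simple(self):
        """Tests that `dispatch_model` quantizes fp4 layers"""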
What does this PR do?
Fixes #1504
This PR applies a similar enhancement to the one in #1228, this time for FP4 layers.
Now the script below outputs the desired dtype:
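The script itself is not included in this excerpt; purely for illustration, here is a sketch in the spirit of the test added in this PR (the checkpoint name, module path, and the `replace_with_bnb_linear` signature are assumptions, not taken from the PR):

# Illustrative sketch, not the PR's actual script: build an empty bloom-560m,
# swap in un-quantized bnb 4-bit linears, then let accelerate's dispatch
# quantize them when placing the weights on GPU. Checkpoint name, module path,
# and helper signature are assumptions.
import torch
from accelerate import init_empty_weights, load_checkpoint_and_dispatch
from huggingface_hub import hf_hub_download
from transformers import AutoConfig, AutoModel, BitsAndBytesConfig
from transformers.utils.bitsandbytes import replace_with_bnb_linear

model_id = "bigscience/bloom-560m"  # small checkpoint chosen for illustration

with init_empty_weights():
    model = AutoModel.from_config(AutoConfig.from_pretrained(model_id))

quant_config = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_quant_type="fp4")
model = replace_with_bnb_linear(model, quantization_config=quant_config)

checkpoint = hf_hub_download(model_id, "pytorch_model.bin")
model = load_checkpoint_and_dispatch(model, checkpoint=checkpoint, device_map={"": 0})

# With this PR, dispatch quantizes the fp4 layers, so the 4-bit weights are
# stored as torch.uint8 rather than staying in fp16/fp32.
assert model.h[0].self_attention.query_key_value.weight.dtype == torch.uint8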
cc @sgugger @BlackSamorez