When I executed transformer models in bfloat16 from Hugging Face, I got incorrect results on s390x. I realized the values in the weights differ between x86 and s390x. The following is a small reproduction.
Execute the following program on x86:
import torch
from safetensors import safe_open
from safetensors.torch import save_file

# Write an 8x8 bfloat16 tensor of ones, then read it back.
tensors = {
    "weight1": torch.ones((8, 8), dtype=torch.bfloat16),
}
save_file(tensors, "bf16.safetensors")

read_tensors = {}
with safe_open("bf16.safetensors", framework="pt", device="cpu") as f:
    for key in f.keys():
        read_tensors[key] = f.get_tensor(key)
print(read_tensors)
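To rule out corruption during the copy itself, it may help to record a checksum of the file on x86 and compare it on s390x. A minimal sketch (the file name matches the script above); if the digests match on both machines, the bytes on disk are identical and the difference must arise at load time:

import hashlib

# Hash the serialized file so the two machines can be compared.
with open("bf16.safetensors", "rb") as f:
    print(hashlib.sha256(f.read()).hexdigest())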
Copy bf16.safetensors to the s390x machine. Then, execute the following program:
import torch
from safetensors import safe_open

# Read the file that was written on x86.
read_tensors = {}
with safe_open("bf16.safetensors", framework="pt", device="cpu") as f:
    for key in f.keys():
        read_tensors[key] = f.get_tensor(key)
print(read_tensors)
On s390x, the printed weight values differ from what was written on x86.
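This looks like an endianness issue: if I understand the format correctly, safetensors stores tensor data little-endian, while s390x is big-endian, so a loader that reinterprets the raw bytes in native order swaps the two bytes of every bfloat16 element. A minimal sketch of the suspected mechanism (my assumption about the cause, not a confirmed diagnosis):

import struct

# bfloat16 1.0 has the bit pattern 0x3F80.
# Stored little-endian on disk, its two bytes are b'\x80\x3f'.
on_disk = struct.pack("<H", 0x3F80)

# A big-endian host reading those bytes in native order sees:
misread = struct.unpack(">H", on_disk)[0]
print(hex(misread))  # 0x803f -- a tiny negative subnormal, not 1.0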
Expected behavior
The result on s390x should match the x86 output: an 8x8 bfloat16 tensor of ones.
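Until this is fixed in safetensors, a temporary workaround, assuming the corruption is a pure per-element byte swap, would be to fix up the loaded tensors on big-endian hosts. A sketch, not a proper fix (the load_fixed helper is hypothetical, written for this issue):

import sys
import torch
from safetensors import safe_open

def load_fixed(path):
    tensors = {}
    with safe_open(path, framework="pt", device="cpu") as f:
        for key in f.keys():
            t = f.get_tensor(key)
            if sys.byteorder == "big" and t.dtype == torch.bfloat16:
                # Reinterpret the bf16 payload as int16, swap each
                # element's bytes via numpy, and view it back as bf16.
                swapped = t.view(torch.int16).numpy().byteswap()
                t = torch.from_numpy(swapped.copy()).view(torch.bfloat16)
            tensors[key] = t
    return tensors

print(load_fixed("bf16.safetensors"))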
My colleague is also curious whether this reproduction code is correct.