Unable to cast Python instance of type <class 'torch._subclasses.fake_tensor.FakeTensor'> to C++ type #1351

zwhe99 · 2024-11-21T14:49:44Z

[rank7]:   File "/data/miniconda3/envs/o1/lib/python3.10/site-packages/torch/_ops.py", line 1116, in __call__
[rank7]:     return self._op(*args, **(kwargs or {}))
[rank7]: torch._dynamo.exc.TorchRuntimeError: Failed running call_function flash_attn._flash_attn_varlen_forward(*(FakeTensor(..., device='cuda:7', size=(2982, 32, 128), dtype=torch.bfloat16,
[rank7]:            grad_fn=<AsStridedBackward0>), FakeTensor(..., device='cuda:7', size=(2982, 8, 128), dtype=torch.bfloat16,
[rank7]:            grad_fn=<AsStridedBackward0>), FakeTensor(..., device='cuda:7', size=(2982, 8, 128), dtype=torch.bfloat16,
[rank7]:            grad_fn=<AsStridedBackward0>), FakeTensor(..., device='cuda:7', size=(10,), dtype=torch.int32), FakeTensor(..., device='cuda:7', size=(10,), dtype=torch.int32), FakeTensor(..., device='cuda:7', size=(), dtype=torch.int64), FakeTensor(..., device='cuda:7', size=(), dtype=torch.int64), 0.0, 0.08838834764831845), **{'causal': True, 'window_size_left': -1, 'window_size_right': -1, 'softcap': 0.0, 'alibi_slopes': None, 'return_softmax': False, 'block_table': None}):
[rank7]: flash_attn::_flash_attn_varlen_forward() Expected a value of type 'int' for argument 'max_seqlen_q' but instead found type 'FakeTensor'.
[rank7]: Position: 5
[rank7]: Value: FakeTensor(..., device='cuda:7', size=(), dtype=torch.int64)
[rank7]: Declaration: flash_attn::_flash_attn_varlen_forward(Tensor q, Tensor k, Tensor v, Tensor cu_seqlens_q, Tensor cu_seqlens_k, SymInt max_seqlen_q, SymInt max_seqlen_k, float dropout_p, float softmax_scale, bool causal, SymInt window_size_left=-1, SymInt window_size_right=-1, float softcap=0., Tensor? alibi_slopes=None, bool return_softmax=False, Tensor? block_table=None, Tensor? leftpad_k=None, Tensor? seqused_k=None) -> (Tensor, Tensor, Tensor, Tensor)
[rank7]: Cast error details: Unable to cast Python instance of type <class 'torch._subclasses.fake_tensor.FakeTensor'> to C++ type '?' (#define PYBIND11_DETAILED_ERROR_MESSAGES or compile in debug mode for details)

[rank7]: from user code:
[rank7]:    File "/data/miniconda3/envs/o1/lib/python3.10/site-packages/transformers/modeling_flash_attention_utils.py", line 280, in torch_dynamo_resume_in__flash_attention_forward_at_273
[rank7]:     attn_output = flash_attn_varlen_func(
[rank7]:   File "/data/miniconda3/envs/o1/lib/python3.10/site-packages/flash_attn/flash_attn_interface.py", line 1407, in flash_attn_varlen_func
[rank7]:     return FlashAttnVarlenFunc.apply(
[rank7]:   File "/data/miniconda3/envs/o1/lib/python3.10/site-packages/flash_attn/flash_attn_interface.py", line 896, in forward
[rank7]:     out_padded, softmax_lse, S_dmask, rng_state = _wrapped_flash_attn_varlen_forward(

[rank7]: Set TORCH_LOGS="+dynamo" and TORCHDYNAMO_VERBOSE=1 for more information


[rank7]: You can suppress this exception and fall back to eager by setting:
[rank7]:     import torch._dynamo
[rank7]:     torch._dynamo.config.suppress_errors = True

fzyzcjy · 2024-11-29T05:22:48Z

+1, I am seeing the same issue. Are there any updates?
