Open
Description
Repro:
import torch
from nvfuser import FusionDefinition, DataType

def nvfuser_fusion_id5(fd : FusionDefinition) -> None :
    T0 = fd.define_tensor(shape=[1, 1], contiguity=[None, None], dtype=DataType.Int, is_cpu=False, stride_order=[1, 0])
    S1 = fd.define_scalar(-2, dtype=DataType.Int)
    T7 = fd.ops.pad(T0, [0, 2, 0, 2], S1)
    fd.add_output(T7)

with FusionDefinition() as fd:
    nvfuser_fusion_id5(fd)

inputs = [
    torch.testing.make_tensor((1, 1), dtype=torch.int64, device='cuda:0'),
]
fd.execute(inputs)
Stacktrace:
Error replaying transforms in contiguous ID checker, expected iS10{9} to be in the active ID set.
Exception raised from checkExclusivelyConsumesAllocs at /opt/pytorch/nvfuser/csrc/contiguity.cpp:51 (most recent call first):
frame #0: nvfuser::nvfCheckFail(char const*, char const*, unsigned int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) + 0x103 (0x7f3d6bb6010f in /opt/pytorch/nvfuser/nvfuser/_C.cpython-312-x86_64-linux-gnu.so)
frame #1: nvfuser::nvfErrorFail(char const*, char const*, unsigned int, char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) + 0x62 (0x7f3d6bf9fea2 in /opt/pytorch/nvfuser/nvfuser/_C.cpython-312-x86_64-linux-gnu.so)
frame #2: <unknown function> + 0x3dd516 (0x7f3d6bdc0516 in /opt/pytorch/nvfuser/nvfuser/_C.cpython-312-x86_64-linux-gnu.so)
Env:
- pjnl-20250218
- H100
This also caused the following Thunder test to fail:
pytest -vsx thunder/tests/test_ops.py -k test_core_vs_jax_consistency_pad_nvfuser_cuda_thunder
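For reference, here is a minimal pure-Python sketch of the result the repro would be expected to produce if the fusion executed. It assumes `fd.ops.pad` takes pad widths in the same last-dimension-first order as `torch.nn.functional.pad`. The helper `pad_2d_reference` and the input value `7` are hypothetical and only for illustration; the repro itself uses random input data.

```python
def pad_2d_reference(rows, pad_widths, value):
    """Constant-pad a 2D list of lists.

    pad_widths is [left, right, top, bottom], i.e. last dimension first.
    This ordering is the one torch.nn.functional.pad uses and is assumed
    here to match fd.ops.pad.
    """
    left, right, top, bottom = pad_widths
    if min(pad_widths) < 0:
        raise ValueError("negative pad widths are not handled in this sketch")
    width = (len(rows[0]) if rows else 0) + left + right
    # Rows added above the input, filled entirely with the pad value.
    padded = [[value] * width for _ in range(top)]
    # Original rows, extended on the left and right.
    for row in rows:
        padded.append([value] * left + list(row) + [value] * right)
    # Rows added below the input.
    padded.extend([value] * width for _ in range(bottom))
    return padded


# Same arguments as the repro: (1, 1) input, pad widths [0, 2, 0, 2], fill value -2.
# The element 7 is an arbitrary placeholder for the random input.
expected = pad_2d_reference([[7]], [0, 2, 0, 2], -2)
for row in expected:
    print(row)
```

Under that assumption the output is a 3x3 tensor with the input element at index [0][0] and -2 everywhere else. The 9 elements of that shape may be what the extent in `iS10{9}` refers to, though that is an inference from the error message rather than something confirmed here.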