Segmentation fault when running tritonbench flash attention with --causal #18

@yjk21

Description

Describe the bug

I'm running the benchmarking command from the ws branch, but with the --causal flag added, i.e.:

TORCH_CUDA_ARCH_LIST=9.0a cuda-gdb --args python run.py --op flash_attention --only triton_tutorial_flash_v2_ws,triton_tutorial_flash_v2_tma_ws,triton_tutorial_flash_v2 --num-inputs 1 --seq-len 4096 --metrics tflops --batch 8 --n-heads 16 --d-head 128 --causal

I'm seeing a segfault here:

Thread 1 "python" received signal SIGSEGV, Segmentation fault.
0x00007d54b9f6b32e in mlir::detail::IROperandBase::insertInto<mlir::IRObjectWithUseList<mlir::OpOperand> > (useList=0x5b34c5d18750, this=0x5b34c5cef190) at /root/.triton/llvm/llvm-b5cc222d-ubuntu-x64/include/mlir/IR/UseDefLists.h:101
101           nextUse->back = &nextUse;

Without the flag, everything works as intended (WAI).
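For context, the crash site in UseDefLists.h is MLIR's intrusive use-def list: each operand is spliced into a per-value linked list, and `back` holds the address of the pointer that points at the operand. A minimal sketch of that insertion pattern (hypothetical `Operand`/`UseList` types, not MLIR's actual classes) shows why a corrupted list head turns the write on line 101 into a segfault:

```cpp
#include <cassert>

// Hypothetical stand-ins for MLIR's OpOperand / IRObjectWithUseList.
struct Operand {
    Operand* nextUse = nullptr;   // next operand using the same value
    Operand** back = nullptr;     // address of the pointer that points at us
};

struct UseList {
    Operand* firstUse = nullptr;

    // Mirrors the shape of IROperandBase::insertInto: splice `op` in at
    // the head of the list. If `firstUse` is dangling or corrupted, the
    // write `op->nextUse->back = &op->nextUse` (line 101 in the trace)
    // dereferences invalid memory and segfaults.
    void insertInto(Operand* op) {
        op->nextUse = firstUse;
        if (op->nextUse)
            op->nextUse->back = &op->nextUse;  // crash site when nextUse is bogus
        firstUse = op;
        op->back = &firstUse;
    }
};
```

This suggests the segfault is a symptom of an invalid operand or value reaching the use list during the --causal lowering path, rather than a bug in the list code itself.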

Environment details

Tritonbench at 3a5dccb159834968567a2e45e561dc1aeaa8f8a8
Meta triton at 67f51cc1420cabeb6bf4d28c1813e38ea9a92e20

Metadata

Labels: bug (Something isn't working)