[Issue]: Rocprofv3 kernel trace bug #187

@fsx950223

Problem Description

*** Aborted at 1773134412 (unix time) try "date -d @1773134412" if you are using GNU date ***
PC: @     0x7f0e47e7291c (unknown)
*** SIGSEGV (@0x7f0beb701340) received by PID 3038338 (TID 0x7f0e3d79a580) from PID 18446744073364575040; stack trace: ***
    @     0x7f0e47d78ed3 (unknown)
    @     0x7f0e48ce6a00 (unknown)
    @     0x7f0e47d1c330 (unknown)
    @     0x7f0e47e7291c (unknown)
    @     0x7f0e482be5e8 (unknown)
    @     0x7f0e4841f754 (unknown)
    @     0x7f0e48482a51 (unknown)
    @     0x7f0e4811bb97 rocprofiler_iterate_buffer_tracing_record_args
    @     0x7f0e48a11727 (unknown)
    @     0x7f0e489fe527 (unknown)
    @     0x7f0e4897df8c (unknown)
    @     0x7f0e4897f69e (unknown)
    @     0x7f0e4813be58 (unknown)
    @     0x7f0e4813c59b (unknown)
    @     0x7f0e47d78ed3 (unknown)
    @     0x7f0e48137c5a (unknown)
    @     0x7f0e48967f29 (unknown)
    @     0x7f0e4896d4c6 (unknown)
    @     0x7f0e47d011ca (unknown)
    @     0x7f0e47d0128b __libc_start_main
    @           0x6574f5 _start
[rocprofv3] Fatal error: Command '['python', 'tests/kernels/test_pa.py']' died with <Signals.SIGSEGV: 11>.
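Most frames in the trace above are unsymbolized, so it helps to pull out the few that carry a name. The following is a minimal sketch, assuming the glog-style frame layout shown above (`@ <address> <symbol>`); the helper names `parse_frames` and `named_frames` are made up for illustration and are not part of any ROCm tool.

```python
import re

# Matches glog-style stack frames such as:
#     @     0x7f0e4811bb97 rocprofiler_iterate_buffer_tracing_record_args
FRAME_RE = re.compile(r"^\s*@\s+(0x[0-9a-fA-F]+)\s+(.+?)\s*$")


def parse_frames(log_text):
    """Return a list of (address, symbol) tuples; symbol is None if unknown."""
    frames = []
    for line in log_text.splitlines():
        match = FRAME_RE.match(line)
        if not match:
            continue  # skips the "PC:" and "*** SIGSEGV" header lines
        address = int(match.group(1), 16)
        symbol = match.group(2)
        frames.append((address, None if symbol == "(unknown)" else symbol))
    return frames


def named_frames(frames):
    """Return (frame_index, symbol) for every frame that has a symbol."""
    return [(i, sym) for i, (_, sym) in enumerate(frames) if sym is not None]
```

Run against the trace in this report, the only named frame inside the profiler libraries is rocprofiler_iterate_buffer_tracing_record_args; the remaining named frames are __libc_start_main and _start.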

I got the error above when running rocprofv3 -i input.yaml -- python tests/kernels/test_pa.py with the following input.yaml:

jobs:
   -
       kernel_include_regex: pa_decode_dot_kernel*
       kernel_iteration_range: "[1, [3-4]]"
       output_file: out
       output_directory: pa_ps_profile
       output_format: [json, csv, otf2, pftrace]
       truncate_kernels: true
       sys_trace: true # enable for pftrace and otf2
       advanced_thread_trace: true # enable for att and ui folder
       att_target_cu: 1
       att_shader_engine_mask: "0xf" # collect one CU from 4 SEs
       att_simd_select: "0xf" # collect 4 SIMDs on single CU
       att_buffer_size: "0x6000000"
   -
       pmc: [SQ_WAVES, FETCH_SIZE]
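
To narrow down which option triggers the crash, one approach is to rerun with reduced variants of the first job. The sketch below generates such variants using only the keys from the config above. The group names (`no_att`, `no_sys_trace`, and so on) and the helper functions are hypothetical, the second (pmc) job is left out to keep each variant small, and it has not been verified that rocprofv3 accepts every reduced combination.

```python
import os

# Options copied from the first job above; values are kept as raw YAML text.
BASE_JOB = {
    "kernel_include_regex": "pa_decode_dot_kernel*",
    "kernel_iteration_range": '"[1, [3-4]]"',
    "output_file": "out",
    "output_directory": "pa_ps_profile",
    "output_format": "[json, csv, otf2, pftrace]",
    "truncate_kernels": "true",
    "sys_trace": "true",
    "advanced_thread_trace": "true",
    "att_target_cu": "1",
    "att_shader_engine_mask": '"0xf"',
    "att_simd_select": '"0xf"',
    "att_buffer_size": '"0x6000000"',
}

# Hypothetical groups of options to drop together in one variant.
GROUPS = {
    "no_att": [
        "advanced_thread_trace",
        "att_target_cu",
        "att_shader_engine_mask",
        "att_simd_select",
        "att_buffer_size",
    ],
    "no_sys_trace": ["sys_trace"],
    "no_iteration_range": ["kernel_iteration_range"],
    "no_truncate": ["truncate_kernels"],
}


def render_job(options):
    """Render a single-job input file using the indentation from the report."""
    lines = ["jobs:", "   -"]
    lines += [f"       {key}: {value}" for key, value in options.items()]
    return "\n".join(lines) + "\n"


def reduced_configs(base, groups):
    """Return {variant_name: yaml_text}, each with one option group removed."""
    configs = {}
    for name, dropped in groups.items():
        kept = {k: v for k, v in base.items() if k not in dropped}
        configs[name] = render_job(kept)
    return configs


def write_reduced_configs(directory="."):
    """Write one input file per variant and return the commands to run."""
    commands = []
    for name, text in reduced_configs(BASE_JOB, GROUPS).items():
        path = os.path.join(directory, f"input_{name}.yaml")
        with open(path, "w") as fh:
            fh.write(text)
        commands.append(f"rocprofv3 -i {path} -- python tests/kernels/test_pa.py")
    return commands
```

Calling write_reduced_configs() writes input_no_att.yaml and the other variants to the current directory and returns the matching rocprofv3 commands. If the SIGSEGV disappears for exactly one variant, the dropped group is the likely trigger.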

Operating System

Ubuntu

CPU

AMD

GPU

MI308

ROCm Version

rocm7.2

ROCm Component

No response

Steps to Reproduce

No response

(Optional for Linux users) Output of /opt/rocm/bin/rocminfo --support

No response

Additional Information

No response
