[Bug] Hang after load with stride (32 bit ew) #58

@callme-sam

Description

I created a test case to reproduce a hang that occurs after a strided vector load (vlse32.v) when no hardware barrier is inserted.
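For readers less familiar with the instruction, here is a minimal scalar sketch of what a strided 32-bit load does according to the RISC-V vector specification: element i is read from the base address plus i times a byte stride. The function name `vlse32_model` and its parameters are hypothetical and only illustrate the addressing; this is not code from the attached test case, and it says nothing about how Spatz implements the load.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical scalar model of vlse32.v (illustration only).
 * Copies vl 32-bit elements into dst; element i is read from
 * base + i * stride_bytes. The stride is in bytes, as in the RVV spec. */
void vlse32_model(uint32_t *dst, const void *base,
                  ptrdiff_t stride_bytes, size_t vl)
{
    const unsigned char *p = (const unsigned char *)base;
    for (size_t i = 0; i < vl; ++i) {
        /* memcpy avoids alignment and aliasing assumptions on the host. */
        memcpy(&dst[i], p + (ptrdiff_t)i * stride_bytes, sizeof(uint32_t));
    }
}
```

With a row-major matrix of 32-bit elements, passing the address of the first element of a column and a stride equal to the row size in bytes gathers that column, which is the typical reason a kernel reaches for vlse32.v.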

The test case exercises a linear algebra kernel (matrix multiplication with transposed B).
main.c runs the kernel and compares the result against a golden model.
Input data and the expected output are provided in data.h and were generated with a Python script.
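As a rough illustration of the kind of check involved, the sketch below shows a scalar reference and a comparison helper. It is an assumption, not the code in the attached archive: it takes "transposed B" to mean C = A * B^T with A stored as M x K and B as N x K, all row-major, and it uses int32_t elements. The names `matmul_trans_b_ref` and `count_mismatches` are hypothetical; the actual kernel, data layout, and element type may differ.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar reference (assumed layout, see lead-in):
 * C (m x n) = A (m x k) * B^T, where B is stored as n x k, row-major. */
void matmul_trans_b_ref(const int32_t *a, const int32_t *b, int32_t *c,
                        size_t m, size_t n, size_t k)
{
    for (size_t i = 0; i < m; ++i) {
        for (size_t j = 0; j < n; ++j) {
            int32_t acc = 0;
            for (size_t p = 0; p < k; ++p) {
                /* Row i of A dotted with row j of B (a column of B^T). */
                acc += a[i * k + p] * b[j * k + p];
            }
            c[i * n + j] = acc;
        }
    }
}

/* Returns how many elements of got differ from want. */
size_t count_mismatches(const int32_t *got, const int32_t *want, size_t len)
{
    size_t bad = 0;
    for (size_t i = 0; i < len; ++i) {
        if (got[i] != want[i]) {
            ++bad;
        }
    }
    return bad;
}
```

A test in this style would report success when `count_mismatches` returns 0 for the kernel output against the expected data.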

How to reproduce

  • Extract the attached zip archive into sw/spatzBenchmarks

  • Add the following lines to sw/spatzBenchmarks/CMakeLists.txt:

    add_library(mat-mul-trans-B mat-mul-trans-B/kernel/mat-mul-trans-B.c)
    add_spatz_test_threeParam(mat-mul-trans-B mat-mul-trans-B/main.c 64 32 64)

  • Build and run

Expected Behaviour

The test should run to completion without an explicit barrier.
Instead, the code hangs after executing a strided vector load (vlse32.v).
Inserting a hardware barrier immediately after the load avoids the hang.

Workaround

Uncommenting the hardware barrier after the strided loads avoids the hang:

asm volatile ("vlse32.v v16, (%0), %1" :: "r"(col_b1), "r"(stride));
asm volatile ("vlse32.v v24, (%0), %1" :: "r"(col_b1_next), "r"(stride));
snrt_cluster_hw_barrier();

After recompiling with the barrier enabled, the test runs to completion:

# Loading entry point: 80000000
#
##################################### MATRIX_MUL_TRANS_B TEST ####################################
#
# INFO | Running 'matrix_mul_trans_B' test on Spatz Cluster
# INFO | Test SUCCESS
#
##########################################################################################
#
# ** Info: [SUCCESS] Program finished successfully

mat-mul-trans-B.zip
