ReadOnlySpan<char>.Trim for small inputs is a few times slower on Linux #13669

@adamsitnik

| Slower | Lin/Win | Win Median (ns) | Lin Median (ns) | Modality |
| --- | --- | --- | --- | --- |
| System.Memory.ReadOnlySpan.Trim(input: "") | 6.45 | 3.36 | 21.68 | |
| System.Memory.ReadOnlySpan.Trim(input: "abcdefg") | 5.67 | 4.82 | 27.33 | |
| System.Memory.ReadOnlySpan.Trim(input: " abcdefg ") | 3.32 | 7.78 | 25.83 | |

Benchmark:

https://github.com/dotnet/performance/blob/8b23cabe793b4ff73a9b28c7dd092b11dc17b197/src/benchmarks/micro/corefx/System.Memory/ReadOnlySpan.cs#L77-L79

git clone https://github.com/dotnet/performance.git
python3 ./performance/scripts/benchmarks_ci.py -f netcoreapp5.0 --filter System.Memory.ReadOnlySpan.Trim

I've created a very small repro app:

using System;

class Program
{
    static int Main(string[] args)
    {
        int result = 0;

        ReadOnlySpan<char> span = string.Empty.AsSpan();
        for (int i = 0; i < 1_000_000_000; i++)
        {
            result ^= TrimSourceCopied(span).Length;
        }

        return result;
    }
    
    private static ReadOnlySpan<char> TrimSourceCopied(ReadOnlySpan<char> span)
    {
        int start = 0;
        for (; start < span.Length; start++)
        {
            if (!char.IsWhiteSpace(span[start]))
            {
                break;
            }
        }

        int end = span.Length - 1;
        for (; end > start; end--)
        {
            if (!char.IsWhiteSpace(span[end]))
            {
                break;
            }
        }

        return span.Slice(start, end - start + 1);
    }
}

Using VTune, I was able to narrow the problem down to struct copying:

The body of the Main method on Ubuntu 18.04:

image

Please note the body of the loop:

image

When I change TrimSourceCopied to accept the span as a readonly ref (in) parameter:

private static ReadOnlySpan<char> TrimSourceCopied(in ReadOnlySpan<char> span)

image
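
To make the by-value versus `in` comparison easier to try without a profiler, here is a minimal sketch that times both signatures with Stopwatch. The names SpanPassingSketch, TrimByValue and TrimByIn are hypothetical, the NoInlining attributes are an assumption added so that a real call (and any argument copy) stays in place, and the iteration count is arbitrary. It is only a rough wall-clock check, not a substitute for the benchmark above or for looking at the disassembly.

```csharp
using System;
using System.Diagnostics;
using System.Runtime.CompilerServices;

static class SpanPassingSketch
{
    // Hypothetical helper: same trimming logic as TrimSourceCopied,
    // with the span passed by value (the struct is copied for the call).
    [MethodImpl(MethodImplOptions.NoInlining)] // assumption: keep the call boundary
    public static ReadOnlySpan<char> TrimByValue(ReadOnlySpan<char> span)
    {
        int start = 0;
        while (start < span.Length && char.IsWhiteSpace(span[start])) start++;

        int end = span.Length - 1;
        while (end > start && char.IsWhiteSpace(span[end])) end--;

        return span.Slice(start, end - start + 1);
    }

    // Hypothetical helper: identical logic, but the span is passed
    // as a readonly reference, so only an address is handed to the callee.
    [MethodImpl(MethodImplOptions.NoInlining)] // assumption: keep the call boundary
    public static ReadOnlySpan<char> TrimByIn(in ReadOnlySpan<char> span)
    {
        int start = 0;
        while (start < span.Length && char.IsWhiteSpace(span[start])) start++;

        int end = span.Length - 1;
        while (end > start && char.IsWhiteSpace(span[end])) end--;

        return span.Slice(start, end - start + 1);
    }

    static int Main()
    {
        const int Iterations = 100_000_000; // arbitrary, smaller than the repro's loop
        ReadOnlySpan<char> span = string.Empty.AsSpan();
        int result = 0;

        var sw = Stopwatch.StartNew();
        for (int i = 0; i < Iterations; i++)
        {
            result ^= TrimByValue(span).Length;
        }
        Console.WriteLine($"by value: {sw.ElapsedMilliseconds} ms");

        sw.Restart();
        for (int i = 0; i < Iterations; i++)
        {
            result ^= TrimByIn(in span).Length;
        }
        Console.WriteLine($"in:       {sw.ElapsedMilliseconds} ms");

        // Returning the accumulated value keeps the loops from being discarded.
        return result;
    }
}
```

Running the same binary on Windows and on Linux and comparing the two printed timings should indicate whether the gap follows the parameter-passing style; the absolute numbers will vary by machine and runtime version.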

System.Memory.ReadOnlySpan.Trim is only one example of a benchmark that uses Span heavily and is slower on Linux than on Windows. This pattern, and the problem that comes with it, is quite common.

/cc @AndyAyersMS

category:cq
theme:structs
skill-level:expert
cost:medium

Labels: area-CodeGen-coreclr (CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI); optimization; os-linux (Linux OS, any supported distro); tenet-performance (Performance related issue)