Fix spurious BROKEN frame at top of Linux thread stacks in CPU Stacks viewer #2375

Merged
brianrob merged 2 commits into main from copilot/fix-broken-frame-in-cpu-stacks on Mar 11, 2026

Contributor

Copilot AI commented Mar 9, 2026

Linux EventPipe traces show a spurious BROKEN frame between the thread root and the actual bottom frame (libc.so.6!__clone3) because ReasonableTopFrame() only recognized Windows thread-root modules (ntdll.dll, ntoskrnl.exe) as valid stack tops.

Changes

  • ReasonableTopFrame() in TraceEventStackSource: Recognize libc module variants as valid thread root modules. The check matches libc, libc.so, libc.so.6, libc-2.31.so, etc., but not unrelated libraries such as libcrypto.
string moduleFileName = moduleFile.Name;
if (string.Compare(moduleFileName, "libc", StringComparison.OrdinalIgnoreCase) == 0 ||
    moduleFileName.StartsWith("libc.", StringComparison.OrdinalIgnoreCase) ||
    moduleFileName.StartsWith("libc-", StringComparison.OrdinalIgnoreCase))
{
    m_goodTopModuleIndex = moduleFileIndex;
    return true;
}
  • GetCallStack() in MutableTraceEventStackSource: Mirror the same libc check to prevent insertion of a BROKEN frame when the bottom-most frame is in libc.
bool isNtdll = 5 <= bangIdx && string.Compare(frameName, bangIdx - 5, "ntdll", 0, 5, StringComparison.OrdinalIgnoreCase) == 0;
// Match libc, libc.so, libc.so.6, libc-2.31.so, but not libcrypto etc.
bool isLibc = 4 <= bangIdx &&
    string.Compare(frameName, 0, "libc", 0, 4, StringComparison.OrdinalIgnoreCase) == 0 &&
    (bangIdx == 4 || frameName[4] == '.' || frameName[4] == '-');
if (!isNtdll && !isLibc) { /* insert BROKEN frame */ }
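
The two checks above can be exercised outside the viewer with a small standalone sketch. The class LibcRootCheck and its methods IsLibcModuleName and IsLibcFrameName are hypothetical names chosen for illustration, not TraceEvent APIs. The sketch assumes module names arrive without a directory prefix and that frame names have the form module!symbol, as in the snippets above.

```csharp
using System;

// Hypothetical standalone helpers; the names are illustrative, not TraceEvent APIs.
public static class LibcRootCheck
{
    // Mirrors the module-name check shown for ReasonableTopFrame().
    public static bool IsLibcModuleName(string moduleFileName)
    {
        return string.Equals(moduleFileName, "libc", StringComparison.OrdinalIgnoreCase) ||
               moduleFileName.StartsWith("libc.", StringComparison.OrdinalIgnoreCase) ||
               moduleFileName.StartsWith("libc-", StringComparison.OrdinalIgnoreCase);
    }

    // Mirrors the frame-name check shown for GetCallStack().
    // Assumes frameName looks like "module!symbol"; a name without '!' is rejected.
    public static bool IsLibcFrameName(string frameName)
    {
        int bangIdx = frameName.IndexOf('!');
        return 4 <= bangIdx &&
               string.Compare(frameName, 0, "libc", 0, 4, StringComparison.OrdinalIgnoreCase) == 0 &&
               (bangIdx == 4 || frameName[4] == '.' || frameName[4] == '-');
    }
}
```

Under these assumptions both helpers accept libc.so.6 and libc-2.31.so and reject libcrypto.so.3, because the character after the libc prefix must be '.', '-', or the end of the module name.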
Original prompt

This section details the original issue you should resolve

<issue_title>Spurious BROKEN frame at top of Linux thread stacks in CPU Stacks viewer</issue_title>
<issue_description>When viewing CPU stacks from Linux EventPipe .nettrace traces (e.g., collected by one-collect), every thread stack shows a BROKEN frame between the thread/process root and the actual bottom frame (libc.so.6!__clone3), even though the stack is fully intact:

   ROOT
   + Process32 dotnet (97856)
    + Thread (97868) CPU=2879ms
    |+ BROKEN                          ← spurious
    | + libc.so.6!__clone3
    |  + libc.so.6!start_thread
    |   + libcoreclr.so!CorUnix::CPalThread::ThreadEntry(void*)
    |    ...

Possible root cause:
ReasonableTopFrame() in TraceEventStacks.cs determines whether the deepest frame of a call stack is a valid thread entry point. It only recognizes ntdll.dll (Windows) and ntoskrnl.exe (kernel processes) as valid thread roots. On Linux, thread stacks bottom out in libc.so.6!__clone3 (or __clone, __libc_start_main), which fails this check, so every Linux thread stack is unconditionally routed through the "broken stack" encoding.

Possible fix:
Add libc (matching libc.so.6, libc.so, etc.) as a recognized thread-root module in ReasonableTopFrame(), alongside ntdll.dll.
e.g. in

private bool ReasonableTopFrame(StackSourceCallStackIndex callStackIndex, ThreadIndex threadIndex)

// On Linux, threads start from libc's clone/clone3 or __libc_start_main.
string fileName = moduleFile.Name;
if (string.Compare(fileName, "libc", StringComparison.OrdinalIgnoreCase) == 0 ||
    fileName.StartsWith("libc.", StringComparison.OrdinalIgnoreCase))
{
    m_goodTopModuleIndex = moduleFileIndex;
    return true;
}
</issue_description>

## Comments on the Issue (you are @copilot in this section)

<comments>
</comments>
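
The behavior described in the issue can be modeled with a toy sketch: a BROKEN marker is prepended whenever the thread-base frame's module is not a recognized thread root. Everything here (BrokenFrameModel, ModuleOf, Render, and the two predicates) is hypothetical and greatly simplified. It is not how TraceEvent encodes stacks, and it only illustrates why recognizing libc removes the marker.

```csharp
using System;
using System.Collections.Generic;

// Toy model of the behavior described in the issue; not TraceEvent's implementation.
public static class BrokenFrameModel
{
    // Returns the module part of a "module!symbol" frame name (hypothetical helper).
    public static string ModuleOf(string frame)
    {
        int bang = frame.IndexOf('!');
        return bang < 0 ? frame : frame.Substring(0, bang);
    }

    // frames are ordered from the thread base (outermost) to the leaf.
    // A BROKEN marker is prepended when the base frame's module is not a recognized thread root.
    public static List<string> Render(IReadOnlyList<string> frames, Func<string, bool> isThreadRootModule)
    {
        var result = new List<string>();
        if (frames.Count > 0 && !isThreadRootModule(ModuleOf(frames[0])))
        {
            result.Add("BROKEN");
        }
        result.AddRange(frames);
        return result;
    }

    // Before the fix: only the Windows thread-root modules named in the issue are recognized.
    public static bool RecognizesWindowsRootsOnly(string module) =>
        module.Equals("ntdll.dll", StringComparison.OrdinalIgnoreCase) ||
        module.Equals("ntoskrnl.exe", StringComparison.OrdinalIgnoreCase);

    // With the check proposed in the issue: "libc" and "libc.*" are recognized as well.
    public static bool RecognizesLibcToo(string module) =>
        RecognizesWindowsRootsOnly(module) ||
        module.Equals("libc", StringComparison.OrdinalIgnoreCase) ||
        module.StartsWith("libc.", StringComparison.OrdinalIgnoreCase);
}
```

With the stack from the issue (libc.so.6!__clone3, libc.so.6!start_thread, libcoreclr.so!CorUnix::CPalThread::ThreadEntry(void*)), Render with RecognizesWindowsRootsOnly puts BROKEN first, while Render with RecognizesLibcToo returns the frames unchanged.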

… viewer

Co-authored-by: brianrob <6210322+brianrob@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Fix spurious BROKEN frame in Linux thread stacks viewer" to "Fix spurious BROKEN frame at top of Linux thread stacks in CPU Stacks viewer" on Mar 9, 2026
brianrob marked this pull request as ready for review on March 11, 2026 at 19:48
@brianrob
Member

Confirmed fixed. The before image shows 39 seconds of samples with broken stacks; the top hitter is libc.so.6!__clone3, which is a valid thread base frame. The after image shows that this is fixed: only 130ms of samples still have broken stacks, and the frames implicated there are not valid thread base methods.

Before: [image]
After: [image]

Collaborator

@leculver leculver left a comment

I don't generally love regex as a solution, but I think you might solidly be in the realm of using one here. It would be weird, but plenty of things could match "libc[.-].*.so", which is effectively what you are doing. Usually regex is the wrong option, but better validation wouldn't hurt.

I think matching $@"^libc(-\d+\.\d+)?\.so(\.\d+)?" is what you would want.

I don't think this will hurt anything, so I'm marking this approved. I leave it up to you whether to make the change or just merge this. I haven't considered whether ".so" is always appended or if you truncate that to "libc-3.4" (no .so), or what happens in the .so.68 case, so it might require some workshopping beyond my regex. Don't overcomplicate it if you don't want to.
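
To make the trade-off concrete, here is a minimal sketch that runs the suggested pattern and the PR's prefix check against the names mentioned in this thread. LibcNameMatchers, MatchesRegex, and MatchesPrefix are hypothetical names, and the pattern is used exactly as written in the review (no end anchor). The case-insensitive option is an assumption added to mirror the OrdinalIgnoreCase comparisons in the PR.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical comparison helpers; neither method is part of TraceEvent.
public static class LibcNameMatchers
{
    // The pattern suggested in the review, used as written (no end anchor).
    // IgnoreCase is an assumption made to mirror the PR's OrdinalIgnoreCase checks.
    private static readonly Regex SuggestedPattern = new Regex(
        @"^libc(-\d+\.\d+)?\.so(\.\d+)?",
        RegexOptions.IgnoreCase | RegexOptions.CultureInvariant);

    public static bool MatchesRegex(string moduleFileName) =>
        SuggestedPattern.IsMatch(moduleFileName);

    // The prefix check from the PR description.
    public static bool MatchesPrefix(string moduleFileName) =>
        string.Equals(moduleFileName, "libc", StringComparison.OrdinalIgnoreCase) ||
        moduleFileName.StartsWith("libc.", StringComparison.OrdinalIgnoreCase) ||
        moduleFileName.StartsWith("libc-", StringComparison.OrdinalIgnoreCase);
}
```

In this sketch both checks accept libc.so.6, libc-2.31.so, and libc.so.68, and both reject libcrypto.so.3. They differ on names without ".so": the prefix check accepts libc and libc-3.4, while the pattern rejects them.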

@brianrob
Member

Thanks @leculver. I'm going to stick with the current change. I looked into the regex option, but it's a bit more complex than I'd like here. The risk is also quite low, since this just controls whether or not we render a BROKEN frame. The only risk is a false positive, in which case we would not explicitly render a BROKEN frame.

Development

Successfully merging this pull request may close these issues.

Spurious BROKEN frame at top of Linux thread stacks in CPU Stacks viewer
