Allow printing task backtraces via profiling peek mechanism by Drvi · Pull Request #56043 · JuliaLang/julia

Drvi · 2024-10-08T11:50:42Z

This would be useful, e.g., for debugging stuck tests run via ReTestItems.jl (which uses multiple Julia processes to run the tests).

vtjnash · 2024-10-08T15:17:53Z

base/Base.jl

        while _trywait(cond)
            profile = @something(profile, require_stdlib(PkgId(UUID("9abbd945-dff8-562f-b5e8-e1ebf5ef1b79"), "Profile")))::Module
            invokelatest(profile.peek_report[])
+            if Base.get_bool_env("JULIA_PROFILE_PEEK_TASK_BACKTRACES", false) === true


I think it is even worth making this the default.

Suggested change

-            if Base.get_bool_env("JULIA_PROFILE_PEEK_TASK_BACKTRACES", false) === true
+            if Base.get_bool_env("JULIA_PROFILE_PEEK_TASK_BACKTRACES", true) === true

It would also potentially be great to put this directly in signal_listener (signals-unix.c), so that it operates on fatal signals too.
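For readers unfamiliar with the `=== true` comparison in the diff above, here is a minimal sketch of a tri-state boolean flag lookup. It assumes, as the comparison suggests, that the lookup can return something other than `true` or `false` when the value is not recognized. The helper `bool_flag`, the accepted spellings, and the `Dict` stand-ins for `ENV` are hypothetical and only illustrate the pattern; this is not the implementation of `Base.get_bool_env`.

```julia
# Hypothetical stand-in for a tri-state boolean environment lookup.
# Returns `true`/`false` for a recognized value, `default` when the variable
# is unset, and `nothing` when the value cannot be interpreted.
function bool_flag(env::AbstractDict, name::AbstractString, default::Bool)
    haskey(env, name) || return default
    value = lowercase(strip(env[name]))
    value in ("1", "true", "yes") && return true
    value in ("0", "false", "no") && return false
    return nothing  # unrecognized spelling (assumed behavior)
end

# Comparing with `=== true` treats both `false` and `nothing` as "disabled".
# The default of `true` mirrors the suggested change (print unless opted out).
should_print(env) = bool_flag(env, "JULIA_PROFILE_PEEK_TASK_BACKTRACES", true) === true

# Plain dictionaries stand in for ENV so the sketch has no side effects.
unset   = Dict{String,String}()
opt_out = Dict("JULIA_PROFILE_PEEK_TASK_BACKTRACES" => "0")
garbled = Dict("JULIA_PROFILE_PEEK_TASK_BACKTRACES" => "maybe")
```

With the default set to `true`, only an explicit opt-out or an unrecognized value suppresses the backtraces in this sketch.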

Drvi · 2024-10-09

I've changed the PR so that we print the backtraces by default 👍 As for calling this from the signal handler directly: do you mean we should also print these for signals other than SIGUSR1/SIGINFO? I'd need some guidance with that, as I'm not sure which signals would be suitable to include.

NHDaly · 2024-10-09

FYI: at least for us, with our huge sysimg, this can be really slow. Like >5 minutes slow.

So if that impacts others too, do we want to do this by default?

vtjnash · 2024-10-10

Why does system image size affect this?

Yes, that would be the list of all "critical" signals defined there. There's already a place where we call jl_print_bt_entry_codeloc on the threads, so it would just tack on right after that.

NHDaly · 2024-10-18

We believe the reason is that symbolizing the traces is super slow with our giant binary size. But to be honest, I think we don't know why it's so slow.

vtjnash · 2024-10-19

Ah, gotcha, that does make sense. Symbolizing something big (such as the LLVM debug info when asserts are on) can take quite some time.

vtjnash · 2024-10-19

Although that happens in parallel with the runtime, so I think that still might be okay for something the user has to request pretty explicitly.

vtjnash · 2024-10-08T15:19:40Z

base/Base.jl

            invokelatest(profile.peek_report[])
+            if Base.get_bool_env("JULIA_PROFILE_PEEK_TASK_BACKTRACES", false) === true
+                println(stderr, "Printing Julia task backtraces...")
+                ccall(:jl_print_task_backtraces, Cvoid, ())


This prints to a different abstraction over stderr than the other println calls here, so it may need an explicit flush around it to ensure the output doesn't get jumbled.

As an aside, it can also be better to use print(stderr, "<text>\n") instead of println where reliable async behavior is desired, since it ensures the whole text prints in the same syscall.
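The two suggestions above can be sketched roughly as follows. The helpers `emit_line` and `with_flushed` are hypothetical names introduced for illustration, and the closure passed to `with_flushed` is a placeholder for the native call in the diff (the `ccall` to `jl_print_task_backtraces`), which is deliberately not invoked here. This is a sketch of the pattern under those assumptions, not the code that was merged.

```julia
# Build the whole line first so it reaches `io` through a single `print`
# call, rather than as separate text and newline arguments.
function emit_line(io::IO, msg::AbstractString)
    print(io, string(msg, "\n"))
    return nothing
end

# Flush `io` before and after running `native_writer`, a placeholder for code
# that writes to the same descriptor without going through `io`.
function with_flushed(native_writer, io::IO)
    flush(io)
    try
        native_writer()
    finally
        flush(io)
    end
    return nothing
end

# Usage: announce, then run the (placeholder) native printer between flushes.
emit_line(stderr, "Printing Julia task backtraces...")
with_flushed(stderr) do
    # A call such as the ccall from the diff would go here.
    nothing
end
```

The `try`/`finally` keeps the trailing flush even if the wrapped call throws, which seems a reasonable choice for diagnostic output.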

Drvi · 2024-10-08

This is very helpful, thanks!

vtjnash left a comment

Conditional approval: it would probably be better to print this asynchronously (in signal_listener) immediately, instead of waiting for 1 second and then printing just before the next yield point finally returns to the current Task (here), which could be a very long delay.

vtjnash reviewed Oct 8, 2024


Drvi force-pushed the td-profile-peek-task-backtraces branch from db82a3c to 460bab6 on October 9, 2024, 12:56

Print task backtraces by default

0998426

Drvi force-pushed the td-profile-peek-task-backtraces branch from 460bab6 to 0998426 on October 9, 2024, 12:56

vtjnash approved these changes Oct 10, 2024


nickrobinson251 mentioned this pull request Oct 27, 2024

Wall-time/all tasks profiler #55889

Merged

Drvi wants to merge 1 commit into JuliaLang:master from Drvi:td-profile-peek-task-backtraces

3 participants
