|
An additional note: this kind of wall-time/all-tasks profiler also exists for Go (where it is called a goroutine profiler), so there is some precedent for this in other languages as well: https://github.com/felixge/fgprof. |
|
@nickrobinson251 I can't assign you as reviewer... Feel free to assign yourself or post review comments otherwise. |
|
I think this is related to #55103. Could the metrics here be useful in that too? |
For diagnosing excessive scheduling time? I can't immediately see how this PR would be useful for that. |
#55103 seems like a much more direct approach for doing so, at least. |
|
@vtjnash: IIRC your comments stem from the fact that a GC may get interleaved with Task freeing happens in Perhaps we could:
Any thoughts? |
|
A few more workloads suggested by @NHDaly.

Workload 3: compute_heavy.jl

using Base.Threads
using Profile
using PProf

ch = Channel(1)

const MAX_ITERS = (1 << 22)
const N_TASKS = (1 << 12)

function spawn_a_task_waiting_on_channel()
    Threads.@spawn begin
        take!(ch)
    end
end

function sum_of_sqrt()
    sum_of_sqrt = 0.0
    for i in 1:MAX_ITERS
        sum_of_sqrt += sqrt(i)
    end
    return sum_of_sqrt
end

function spawn_a_bunch_of_compute_heavy_tasks()
    Threads.@sync begin
        for i in 1:N_TASKS
            Threads.@spawn begin
                sum_of_sqrt()
            end
        end
    end
end

function main()
    spawn_a_task_waiting_on_channel()
    spawn_a_bunch_of_compute_heavy_tasks()
end

Profile.@profile_walltime main()

Expected results

We have a lot more compute-heavy tasks than sleeping tasks. We expect to see a lot of samples in
|
|
Cool, thanks! 🎉 🤔 Is it expected that the currently-scheduled tasks seem to have their stacks starting at a different frame than the waiting tasks? It looks like the executing tasks start right with the function in the Task ( I can't decide if I think this is helpful or not. On the one hand, it's maybe nice to visually divide the scheduled vs sleeping tasks, but on the other hand I think it would make more sense to integrate the stacks together if they had most of their content shared. |
|
My instinct is that this is not desirable, and we should figure out why they're different, and correct that. |
Good question, I don't know. Will investigate this. |
|
Wrote this short demo on how to use the profiler: https://github.com/d-netto/Wall-time-Profiler-Demo. Feedback is welcome. Perhaps we should add this to some documentation page in Julia itself. |
|
Can you add this to the Profile docs and the NEWS file? |
Sounds good, will do. Thanks for the reviews. |
|
This looks really interesting, I'm excited to use it! To confirm, there's no way to use this with Are there plans to make this modifiable using e.g. env variables like for heap snapshots or function references like [1] Line 818 in fcf7ec0 |
Your understanding is correct.
I think that could be a good addition! Perhaps open a feature request issue for it? We might even want both e.g. I suppose an Also, you might also be interested in #56043 (which depending on your use-case might be even better than a wall-time profile, since it'll print the backtrace for all tasks, not just a sample) |



One limitation of sampling CPU/thread profiles, as is currently done in Julia, is that they primarily capture samples from CPU-intensive tasks.
If many tasks are performing IO or contending for concurrency primitives like semaphores, these tasks won’t appear in the profile, as they aren't scheduled on OS threads sampled by the profiler.
A wall-time profiler, like the one implemented in this PR, samples tasks regardless of OS thread scheduling. This enables profiling of IO-heavy tasks and detecting areas of heavy contention in the system.
Co-developed with @nickrobinson251.
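
For illustration, here is a minimal sketch of the kind of workload described above: tasks that spend almost all of their time blocked, which a CPU/thread profile would mostly miss. The helper `blocked_tasks` and its parameters are hypothetical and not part of this PR. `sleep` stands in for real IO, and the guard falls back to `Profile.@profile` on builds that do not have `Profile.@profile_walltime`.

```julia
using Base.Threads
using Profile

# Hypothetical helper (not part of the PR): spawn `n` tasks that block
# for `seconds` each, standing in for IO-bound or contended work.
function blocked_tasks(n; seconds = 0.5)
    @sync for _ in 1:n
        Threads.@spawn sleep(seconds)
    end
    return n
end

Profile.clear()

# Use the wall-time profiler when this Julia build provides it;
# otherwise fall back to the regular CPU/thread profiler.
@static if isdefined(Profile, Symbol("@profile_walltime"))
    Profile.@profile_walltime blocked_tasks(16)
else
    Profile.@profile blocked_tasks(16)
end

# Print whatever samples were collected.
Profile.print(; maxdepth = 12)
```

Running the same workload under both macros and comparing the output should show the difference in which tasks get sampled. The exact frames reported will depend on the Julia build.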