
WebGPUPathTracer: Calculate detailed sample counts#739

Merged
gkjohnson merged 6 commits into gkjohnson:webgpu-pathtracer from TheBlek:webgpu-sample-counts
Mar 10, 2026

Conversation

@TheBlek

@TheBlek TheBlek commented Mar 2, 2026

  • Function to calculate detailed sample counts (min, max, avg) by reading sampleCountTarget back to the CPU
  • Calculate and display sample counts per second in the viewerTest example

This allows us to accurately see sample counts for the wavefront case and hence compare performance between the two approaches.

@TheBlek TheBlek changed the title WebGPU: Calculate detailed sample counts WebGPUPathTracer: Calculate detailed sample counts Mar 2, 2026
@gkjohnson gkjohnson added this to the v0.0.25 milestone Mar 2, 2026
@gkjohnson
Owner

I think this is a good direction to head - I'm somewhat surprised to see that when running the examples the wavefront approach seems to be logged as significantly faster. I'm wondering if the average is particularly dominated by rays that don't hit the model (or only need one bounce) and therefore finish incredibly quickly. I'm thinking it would be good to display the min samples found as well as the average and max so we can get a better picture.

Is it also deliberate that the samples be tallied every frame after 5 seconds have passed? One concern I have is that this kind of measurement (data readback, cpu tallying) will itself take time and depending on how the wavefront kernel is "sliced" (1 ray iteration per frame vs 10 ray iterations) it will mean that this tally will happen at a different rate for each backend. It might be better to perform the tally once every X seconds to ensure consistency.
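The fixed-interval tally suggested here can be sketched as a small predicate that is independent of how the wavefront kernel is sliced per frame. This is a minimal sketch, not the PR's code; the names `shouldTally`, `elapsedSeconds`, `lastTallySeconds`, and `intervalSeconds` are hypothetical:

```javascript
// Hypothetical sketch: decide whether to run the (expensive) sample-count
// readback, based only on elapsed wall-clock time, so both backends tally
// at the same rate regardless of how many frames each renders per second.
function shouldTally( elapsedSeconds, lastTallySeconds, intervalSeconds ) {

	// Tally only once a full interval has passed since the last tally.
	return elapsedSeconds - lastTallySeconds >= intervalSeconds;

}
```

With this gate the readback cost is paid at the same wall-clock cadence for both the wavefront and megakernel backends, keeping their measurements comparable.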

@TheBlek
Author

TheBlek commented Mar 3, 2026

I'm wondering if the average is particularly dominated by rays that don't hit the model (or only need one bounce) and therefore finish incredibly quickly.

That is very likely. I tried to test the performance by zooming in on the image so that there is little background. Even then wavefront was showing better performance, but this is still skewed by simple paths. Some kind of convergence benchmark would help alleviate this.

However, this confirms that wavefront handles non-uniform paths much better.

Modified the example to query the information once every N seconds.

Comment on lines +164 to +170
const elapsed = pathTracer.getRenderTime() / 1000;
if ( elapsed < detailedSampleInterval ) {

detailedSampleCount = null;
lastDetailedSample = 0;

}
Owner

Might be worth adding a comment here that says this is just resetting the sample count state.

Comment on lines +338 to +339
const tempTarget = new RenderTarget( width, height, { format: sampleCountTarget.format, type: sampleCountTarget.type } );
tempTarget.textures[ 0 ] = sampleCountTarget;
Owner

Is this going to cause a memory leak? Does three.js create a new internal render target setup when assigning the underlying texture like this?

let totalSamples = 0;
let minSamples = Number.MAX_VALUE;
let maxSamples = - Number.MAX_VALUE;
for ( let i = 0; i < uintBuffer.length; i ++ ) {
Owner

nit: I often use statically-declared lengths to avoid the overhead of accessing the ".length" member on every iteration of the loop, especially with long loops. It's a small optimization but still worthwhile, and it's used throughout three.js as well:

for ( let i = 0, l = uintBuffer.length; i < l; i ++ ) {
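Putting the quoted loop body and the cached-length suggestion together, the reduction could look like the following. This is a sketch, not the PR's actual code; `computeSampleStats` is a hypothetical helper name:

```javascript
// Hypothetical sketch: reduce a per-pixel sample-count buffer to
// min / max / avg, using the statically-declared length suggested above.
function computeSampleStats( uintBuffer ) {

	let totalSamples = 0;
	let minSamples = Number.MAX_VALUE;
	let maxSamples = - Number.MAX_VALUE;
	for ( let i = 0, l = uintBuffer.length; i < l; i ++ ) {

		const v = uintBuffer[ i ];
		totalSamples += v;
		minSamples = Math.min( minSamples, v );
		maxSamples = Math.max( maxSamples, v );

	}

	return { min: minSamples, max: maxSamples, avg: totalSamples / uintBuffer.length };

}
```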


} );

}
Owner

It's maybe a bit picky but I'm feeling like we need to account for the remainder milliseconds here. If we get the sample count every 4 seconds, for example, and we hit this condition at 4.1 seconds then that's 100ms that's gone and means the next interval will actually be waiting for 8.1 seconds to pass. Over time this will build up and result in the number of queries made for wavefront vs megakernel possibly being out of sync after the same amount of time.

Setting lastDetailedSample to the following should work:

lastDetailedSample = Math.floor( elapsed / detailedSampleInterval ) * detailedSampleInterval;
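The effect of the suggested one-liner can be sketched as a small helper; `nextTallyBaseline` is a hypothetical name wrapping the expression above:

```javascript
// Hypothetical sketch of the remainder-aware reset: snap the baseline
// back to the start of the interval that just elapsed, so leftover
// milliseconds don't accumulate across tallies.
function nextTallyBaseline( elapsed, detailedSampleInterval ) {

	return Math.floor( elapsed / detailedSampleInterval ) * detailedSampleInterval;

}
```

With a 4-second interval, hitting the condition at 4.1s resets the baseline to 4 rather than 4.1, so the next tally fires at 8s instead of 8.1s and the drift never compounds.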

@gkjohnson
Owner

🎉

@gkjohnson gkjohnson merged commit f4fd1f8 into gkjohnson:webgpu-pathtracer Mar 10, 2026
1 check passed